I. — Introduction.

§ 1. In the ordinary theory of statistical correlation, normal or otherwise, we are 
always supposed to be dealing with material susceptible of continuous variation, or 
at least of variation by a considerable number of discontinuous steps. The correla- 
tions of lengths or measurements on portions of the body form examples of the first 
kind ; of numbers of children in families, petals or other parts of flowers, are examples 
of the second. 

Certain practical cases arise, however, where either no variation is thinkable at all, 
or else is not measured or possibly measurable. We may class a number of indi- 
viduals into deaf and not deaf, blind and not blind, imbecile and not imbecile, without 
attempting to go further (although gradations of deafness, blindness, and imbecility 
occur), and demand on the basis of the enumeration a discussion of the association*
of the three infirmities. Or again the data may be the mortality from some disease
with and without the administration of, say, a new antitoxin, the statistics giving

    number who died to whom antitoxin was administered,
    number who died to whom antitoxin was not administered ;
    number who did not die to whom antitoxin was not administered,
    number who did not die to whom antitoxin was administered ;

and from these data a discussion of the value of the cure is required. Here there is
no scale of "death" ; there may be a scale of "antitoxin" if the dose varied, but
not otherwise.

* To distinguish it from the "correlation" of continuous variables.

VOL. CXCIV. — A 258. 2 L 14.7.1900

258 MR. G. UDNY YULE ON THE ASSOCIATION

§ 2. Evidently such cases are of great importance, but the theory and means of
handling them have received little attention from statisticians. Logicians have had a
monopoly of the theory, but the superior interests of pure logic seem generally to
have hindered them from developing it in a practical direction. The classical
writings on the subject are, I suppose, those of De Morgan,* Boole,† and Jevons.‡
Without attempting to criticise the work of his predecessors, to both of whom he
was of course greatly indebted, the method of the latter must be allowed to far
exceed theirs in clearness and simplicity. Boole's calculus of elective operators is
highly complex in its working and necessitates the remembrance of many somewhat
artificial rules ; Jevons' method is practically intuitive. It is a matter of surprise
to me that Jevons never made any practical application of his method (so far as I am
aware) during the decade or more that elapsed between the publication of his paper
(loc. cit.) and his death. The following is a brief explanation of his notation and
method.

§ 3. The symbols A, B, C, &c., are used to denote objects or individuals having
the qualities A, B, C, &c. The terms enclosed in brackets thus, (A), (B), (C),
&c., denote the frequency of individuals possessed of the quality or qualities
A, B, or C, or the total number of such individuals observed in the given "universe
of discourse."§ A compound term like AB denotes the class or group possessed of both
qualities A and B, and (AB) its frequency ; compound groups may occur with any
number of specified qualities, i.e., (ABC), (ABCD), or (BDKMN). Corresponding
to each positive term there is a negative term which we shall denote by a small
Greek letter‖ α, β, γ, &c. Thus α signifies "not A," β "not B," and so on ; and
(α), (β), &c., their frequencies. All symbols are used non-exclusively, A signifying
objects having the quality A with or without others, and so on ; consequently the
frequency of any class can be expanded in terms of the frequencies of its sub-classes.

* 'Formal Logic,' chap. VIII., "On the Numerically Definite Syllogism," 1847.
† 'Analysis of Logic,' 1847. 'Laws of Thought,' 1854.
‡ "On a General System of Numerically Definite Reasoning," 'Memoirs of Manchester Literary and
Philosophical Society,' 1870. Reprinted in 'Pure Logic and other Minor Works,' Macmillan, 1890.
§ I have used this convenient term of the logicians for the "material discussed" throughout the paper.
There seems no exact equivalent in ordinary statistical language.
‖ I have substituted small Greek letters for Jevons' italics. Italics are rather troublesome when
reading, as one has to spell out a group like AbcDE, "big A, little b, little c, big D, big E." It is simpler
to read AβγDE. The Greek becomes more troublesome when many letters are wanted, owing to the
non-correspondence of the alphabets, but this is not often of consequence.
Thus

\[
\begin{aligned}
(A) &= (AB) + (A\beta) \\
    &= (ABC) + (AB\gamma) + (A\beta C) + (A\beta\gamma) \\
    &= (ABCD) + (ABC\delta) + (AB\gamma D) + (AB\gamma\delta) \\
    &\quad + (A\beta CD) + (A\beta C\delta) + (A\beta\gamma D) + (A\beta\gamma\delta) \\
    &= \text{\&c.}
\end{aligned}
\]

and also if (U) be the total frequency (total number of observations, total number in
the "universe of discourse")

\[ (U) = (A) + (\alpha) = (B) + (\beta) = (C) + (\gamma) = \text{\&c.} \]

The whole of Jevons' method, so far as applied to purely numerical problems, depends
on the use of equations of the above form, or the expansion of groups in terms of
their sub-classes.
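The expansion of a class into its sub-classes may be checked by direct enumeration. The following is only a minimal sketch: the eight individuals in `universe` are invented for illustration, and the helper `freq` is a hypothetical name, not part of Jevons' notation.

```python
from itertools import product

# Hypothetical toy "universe of discourse": each individual is a tuple of
# booleans (has A, has B, has C). The data are invented for illustration.
universe = [
    (True, True, False), (True, False, False), (True, True, True),
    (False, True, True), (False, False, True), (True, False, True),
    (False, False, False), (True, True, False),
]

def freq(spec):
    """Frequency of the class described by `spec`.

    `spec` maps an attribute index (0 = A, 1 = B, 2 = C) to True for the
    positive quality or False for its negative (Greek-letter) contrary.
    Attributes not mentioned are unrestricted (non-exclusive use).
    """
    return sum(all(ind[i] == want for i, want in spec.items())
               for ind in universe)

# (A) = (AB) + (A beta)
a = freq({0: True})
second = freq({0: True, 1: True}) + freq({0: True, 1: False})

# (A) = (ABC) + (AB gamma) + (A beta C) + (A beta gamma)
third = sum(freq({0: True, 1: b, 2: c})
            for b, c in product([True, False], repeat=2))

# (U) = (A) + (alpha); the empty specification selects the whole universe
total = freq({})

print(a, second, third, total)
```

Whatever data are substituted for `universe`, the three values of (A) must agree, since each individual possessing A falls into exactly one sub-class.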

§ 4. We shall adopt the following conventions. When requiring to distinguish the
qualities denoted by English letters from those denoted by Greek, we shall call the
former positive qualities, the latter negative. A group in which all the qualities
specified are positive will be called a positive group, and conversely.

A group specified by n qualities (positive or negative) will be termed an nth
order group.

To distinguish the nth order groups in n variables from nth order groups
formed from a larger number of variables, we shall refer to the former as "ultimate"
groups.

Two groups such that each quality in the one is the negative (or contrary) of
each quality in the other will be termed contrary groups, and their frequencies
contrary frequencies. Thus

    ABCD     αβγδ
    αBC      Aβγ
    αβγDE    ABCδε

are pairs of contraries. The case where the frequencies of contrary groups are equal
is of some importance. This will be called the case of "equality of contraries."

Consider, for example, the case of normally correlated continuous variables. If
A denote the class in which some quality is above the average, α the class in which
the same is below* average, and so on with B, β, &c., then from the symmetry of
the surface we must have

\[
\begin{aligned}
(A) &= (\alpha), & (B) &= (\beta), \ \text{\&c.,} \\
(AB) &= (\alpha\beta), & (\alpha B) &= (A\beta), \ \text{\&c.,} \\
(ABC) &= (\alpha\beta\gamma), & (AB\gamma) &= (\alpha\beta C), \ \text{\&c.,}
\end{aligned}
\]

and so on with groups of any order.

I shall return later to this case and to the properties that this equality of
contraries produces.

* Logically "not above average" ; but we take the average as a mathematical point, so that there are
no individuals with exactly average qualities.

§ 5. It may be noted at this point that groups may often be rapidly expanded by
using A, B, C, &c., as "elective operators" in Boole's sense, and using the general law
of multiplication of operators, with the special conditions

\[ UA = A, \]

i.e., selecting out the universe of discourse, and then selecting out the A's from it, is
the same thing as selecting out the A's at once, and the "index law"

\[ A^2 = A, \]

i.e., repeating the operation of selecting out the A's has no effect on the objects
included.

To denote that the letters are being used as operators we will use square brackets.
Thus

\[
\begin{aligned}
[AB\gamma] &= [U - \alpha]\,[U - \beta]\,[U - C] \\
&= [U^3] - [U^2\alpha] - [U^2\beta] - [U^2 C] + [U\alpha\beta] + [U\alpha C] + [U\beta C] - [\alpha\beta C]
\end{aligned}
\]

or

\[ (AB\gamma) = (U) - (\alpha) - (\beta) - (C) + (\alpha\beta) + (\alpha C) + (\beta C) - (\alpha\beta C). \]

We only mention the process as it affords such a rapid and easy means of
expansion. The results obtained by its use can always be obtained at a little greater
length by an elementary process of step-by-step substitution.

§ 6. Before proceeding to the consideration of association and so forth, it seems
necessary to discuss somewhat fully the general relations subsisting between the
frequencies of different groups, and the number of independent frequencies of any
order. Suppose, for example, we are dealing with three attributes (A, B, C and
their contraries). Twelve second order and eight third order groups can be formed
from these. It might appear then that if the frequencies of the second order groups
were given, there would be a sufficient number of equations to determine the
frequencies of the third order groups. As a matter of fact this is not so ; the twelve
second order frequencies do not form independent data, and the question arises, How
many are independent ? or, in general, how many independent frequencies or groups
are there in the mth order groups produced from n variables ? i.e., how many of these
mth order frequencies must be given (nothing else being given) in order that the
remaining frequencies of the same order may be calculated ? These questions are
considered in the next section (II.). In Section III. correlation or association and its
measurement are treated ; Section IV. deals with probable errors ; and in Section V.
some arithmetical examples are given of the methods and results previously discussed.

II. — General Relations.

Number of Independent Frequencies.

§ 7. Before proceeding to the problem above described, we will first prove the
theorem:

"The frequency of any group whatever can always be expressed entirely in terms
of the frequencies of the positive groups of its own and lower orders, and the total
frequency (U)."

This theorem may most simply be proved by the method of multiplying operators
as described in the introduction, replacing any negative operator like α by U − A and
multiplying out. We may, however, effect the reduction by step-by-step substitution.
Thus

\[
\begin{aligned}
(A\beta\gamma) &= (A\beta) - (A\beta C) \\
&= (A) - (AB) - (AC) + (ABC).
\end{aligned}
\]

To take terms of the fourth order, for instance:

\[
\begin{aligned}
(ABCD) &= (ABCD) \\
(ABC\delta) &= (ABC) - (ABCD) \\
(AB\gamma\delta) &= (AB) - (ABC) - (ABD) + (ABCD) \\
(A\beta\gamma\delta) &= (A) - (AB) - (AC) - (AD) + (ABC) + (ABD) + (ACD) - (ABCD) \\
(\alpha\beta\gamma\delta) &= (U) - (A) - (B) - (C) - (D) + (AB) + (AC) + (AD) + (BC) \\
&\quad + (BD) + (CD) - (ABC) - (ABD) - (ACD) - (BCD) + (ABCD) \qquad\ldots\qquad (1)
\end{aligned}
\]

Evidently from the form of the last equation all the positive groups are required
to express the frequency of an entirely negative group.
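The reduction to positive groups can be sketched symbolically. In the illustration below the function name `expand_in_positive_groups` and the string encoding of groups are assumptions made for the example only; the rule it applies is the substitution of U − A for each negative operator α, which gives each positive group the sign (−1)^k, where k is the number of negative letters turned positive.

```python
from itertools import combinations

def expand_in_positive_groups(positives, negatives):
    """Return {positive group: coefficient} for a group frequency.

    `positives` and `negatives` are strings of capital letters; a letter in
    `negatives` stands for the contrary (Greek) quality of that attribute.
    The empty group '' stands for the total frequency (U).
    """
    terms = {}
    for k in range(len(negatives) + 1):
        for extra in combinations(negatives, k):
            group = "".join(sorted(positives + "".join(extra)))
            terms[group] = (-1) ** k
    return terms

# (A beta gamma) = (A) - (AB) - (AC) + (ABC)
print(expand_in_positive_groups("A", "BC"))

# (alpha beta gamma delta): every positive group of orders 0 to 4 appears
print(len(expand_in_positive_groups("", "ABCD")))
```

The second call shows why all sixteen positive groups, (U) included, are needed to express an entirely negative group of the fourth order.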

§ 8. Now to the problem:

"To find the number of independent frequencies of mth order groups, the number
of variables being n."

The number of positive groups of order m is (number of combinations of n things m
together)

\[ \frac{n(n-1)\cdots(n-m+1)}{m!}. \]

But by the theorem of § 7 the frequency of any group of the mth order can be
expressed entirely in terms of the frequencies of positive groups. Therefore the
number of independent mth order frequencies must be equal simply to the total
number of positive groups of the mth and lower orders, including (U), the group of
order zero ; that is, equal to

\[ 1 + n + \frac{n(n-1)}{1\cdot 2} + \cdots + \frac{n(n-1)\cdots(n-m+1)}{m!}, \]

or

"The number of independent frequencies of the mth order in n variables is
equal to the sum of the first (m + 1) binomial coefficients."

This gives the following expressions for the number of independent frequencies of
the second, third, fourth, and fifth orders:

\[
\begin{array}{ll}
\text{Order 2nd} & \tfrac{1}{2}(n^2 + n + 2) \\
\text{Order 3rd} & \tfrac{1}{6}(n^3 + 5n + 6) \\
\text{Order 4th} & \tfrac{1}{24}(n^4 - 2n^3 + 11n^2 + 14n + 24) \\
\text{Order 5th} & \tfrac{1}{120}(n^5 - 5n^4 + 25n^3 + 5n^2 + 94n + 120).
\end{array}
\]



The total number of frequencies of any order m is equal to the number of
positive frequencies of that order (see above) multiplied by 2^m, since each letter,
A, B, &c., may be replaced by its negative, and this gives the following expressions
for the second to fifth orders:

\[
\begin{array}{ll}
\text{Order 2nd} & 2n(n-1) \\
\text{Order 3rd} & \tfrac{4}{3}\,n(n-1)(n-2) \\
\text{Order 4th} & \tfrac{2}{3}\,n(n-1)(n-2)(n-3) \\
\text{Order 5th} & \tfrac{4}{15}\,n(n-1)(n-2)(n-3)(n-4).
\end{array}
\]

It is evident from these expressions that, in the general case, the frequencies of any
order can never be expressed in terms of lower order frequencies.

Table I. below gives the number of independent frequencies and the total number
from n = 2 to n = 6 and m = 2 to m = 6.

TABLE I.

Number of Groups of the 2nd to 6th orders (Ind. = independent).

 Number of  | 2nd order  | 3rd order  | 4th order  | 5th order  | 6th order
 variables, |            |            |            |            |
     n.     | Ind. Total | Ind. Total | Ind. Total | Ind. Total | Ind. Total
 -----------+------------+------------+------------+------------+------------
     2      |   4     4  |            |            |            |
     3      |   7    12  |   8     8  |            |            |
     4      |  11    24  |  15    32  |  16    16  |            |
     5      |  16    40  |  26    80  |  31    80  |  32    32  |
     6      |  22    60  |  42   160  |  57   240  |  63   192  |  64    64
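The entries of Table I. follow directly from the two rules of § 8, and may be recomputed as a check. The sketch below assumes only those rules; the function names `independent` and `total` are chosen for the illustration.

```python
from math import comb

def independent(n, m):
    """Number of independent m-th order frequencies in n variables:
    the sum of the first (m + 1) binomial coefficients."""
    return sum(comb(n, k) for k in range(m + 1))

def total(n, m):
    """Total number of m-th order groups: C(n, m) positive groups, each
    letter of which may be replaced by its negative (factor 2**m)."""
    return comb(n, m) * 2 ** m

# One row per number of variables n; one (independent, total) pair per order m
for n in range(2, 7):
    row = [(independent(n, m), total(n, m)) for m in range(2, n + 1)]
    print(n, row)
```

For the ultimate groups (m = n) the two numbers coincide at 2^n, as the last entry of each row of the table shows.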
§ 9. Case of Equality of Contraries.

Before proceeding to the determination of the numbers of independent frequencies
in this case, we shall first prove the following three theorems:

Theorem I. If equality of contraries subsist for frequencies of any given order,
then it subsists for all lower orders.

Theorem II. If equality of contraries subsist for any even order of frequencies, say
2m, then it need not in general subsist for order 2m + 1. If, however, it be
assumed to subsist for order 2m + 1, then the frequencies of this order can be
expressed in terms of those of the lower order 2m.

Theorem III. If equality of contraries subsist for any odd order of frequencies, say
2m − 1, then it must subsist also in frequencies of the next higher order 2m.
But frequencies of this higher order cannot be expressed in terms of those
of order 2m − 1.

The first theorem may be very simply proved.

§ 10. Suppose we are given, for example, that equality of contraries subsists for
frequencies of the fifth order ; then we have

\[
\begin{aligned}
(ABCDE) &= (\alpha\beta\gamma\delta\epsilon) \\
(ABCD\epsilon) &= (\alpha\beta\gamma\delta E)
\end{aligned}
\]

or adding

\[ (ABCD) = (\alpha\beta\gamma\delta) \]

and so on. The expansions of contrary frequencies are in fact necessarily contrary
themselves.

§ 11. Next for Theorem II. To take the simplest case, let us suppose equality of
contraries given for the second order frequencies. Take any third order group and
expand it in terms of its contrary and second order frequencies. This may be done
most elementarily step by step. Thus

\[
\begin{aligned}
(\alpha BC) &= (BC) - (ABC) \\
(\alpha\beta C) &= (\alpha C) - (BC) + (ABC) \\
(\alpha\beta\gamma) &= (\alpha\beta) - (\alpha C) + (BC) - (ABC).
\end{aligned}
\]

Evidently no equality of contraries amongst second order groups will give us
(αβγ) = (ABC). But if we assume this relation to hold we must have

\[
\begin{aligned}
2\,(ABC) &= (\alpha\beta) - (\alpha C) + (BC) \\
&= (AB) + (BC) - (\alpha C) \qquad\ldots\qquad (2)
\end{aligned}
\]

an equation which expresses the third order frequencies in terms of the second.
Similarly if equality of contraries is to subsist amongst fifth order groups when it
subsists amongst those of the fourth order, we must have

\[ 2\,(ABCDE) = (ABCD) + (BCDE) + (AB\delta\epsilon) - (ABC\epsilon) - (\alpha CDE) \qquad\ldots\qquad (3). \]

As the method of expansion used is evidently quite general, this proves Theorem II.
Equations (2) and (3) are evidently quite special relations. The set of arbitrary
frequencies in Table II. below is drawn up to illustrate the theorem for the case of
second and third order frequencies. Column 4 gives a set of second order frequencies
for which equality of contraries subsists, the numbers for this having been in other
respects written down at random. These give the first order frequencies of Column 2.

TABLE II.

   1.       2.     |   3.       4.     |   5.       6.         7.
 Group. Frequency. | Group. Frequency. | Group. Frequency. Frequency.
 ------------------+-------------------+-----------------------------
   A       91      |  AB       27      |  ABC      18         15
   B       91      |  BC       59      |  αBC      41         44
   C       91      |  AC       41      |  AβC      23         26
   α       91      |  Aβ       64      |  ABγ       9         12
   β       91      |  Aγ       50      |  αβC       9          6
   γ       91      |  Bγ       32      |  αBγ      23         20
                   |  αB       64      |  Aβγ      41         38
                   |  αC       50      |  αβγ      18         21
                   |  βC       32      |
                   |  αβ       27      |
                   |  βγ       59      |
                   |  αγ       41      |

If we now proceed to calculate the third order frequencies by equations of the above
form,

\[ 2\,(ABC) = (AB) + (BC) - (\alpha C), \]

that is, using the figures of Column 4,

\[ 2\,(ABC) = 27 + 59 - 50 = 36, \qquad (ABC) = 18, \]

we get a set of frequencies, Column 6, for which equality of contraries subsists.

If we take, however, an arbitrary value for (ABC), say 15, and calculate the
remaining frequencies of the same order from it, we get a set of third order
frequencies (Column 7 of Table II.) for which equality of contraries does not subsist,
but which is equally consistent with the second order frequencies.
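The two sets of third order frequencies in Table II. can be recomputed from Column 4 by the step-by-step substitutions of § 11. The following is a sketch only: lower-case letters stand in for the Greek contraries, and the helper names are hypothetical.

```python
# Second order frequencies of Column 4 (contraries equal); "a", "b", "c"
# stand here for the Greek letters alpha, beta, gamma.
second = {"AB": 27, "BC": 59, "AC": 41, "Ab": 64, "Ac": 50, "Bc": 32,
          "aB": 64, "aC": 50, "bC": 32, "ab": 27, "bc": 59, "ac": 41}

def third_order(abc):
    """All eight third order frequencies implied by a value of (ABC)."""
    t = {"ABC": abc}
    t["aBC"] = second["BC"] - t["ABC"]
    t["AbC"] = second["AC"] - t["ABC"]
    t["ABc"] = second["AB"] - t["ABC"]
    t["abC"] = second["aC"] - t["aBC"]
    t["aBc"] = second["aB"] - t["aBC"]
    t["Abc"] = second["Ab"] - t["AbC"]
    t["abc"] = second["ab"] - t["abC"]
    return t

def contraries_equal(t):
    """True when each group has the same frequency as its contrary."""
    return all(t[g] == t[g.swapcase()] for g in t)

# Equation (2): 2(ABC) = (AB) + (BC) - (alpha C)
abc = (second["AB"] + second["BC"] - second["aC"]) / 2
column6 = third_order(abc)   # contraries equal
column7 = third_order(15)    # arbitrary (ABC); contraries unequal

print(column6, contraries_equal(column6))
print(column7, contraries_equal(column7))
```

Both sets reproduce the same second order frequencies by construction; only the value of (ABC) given by equation (2) also makes the contraries equal.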

§ 12. Now apply precisely the same method to a group of the fourth order. We
get finally

\[ (\alpha\beta\gamma\delta) = (\alpha\beta\gamma) - (\alpha\beta D) + (\alpha CD) - (BCD) + (ABCD). \]

But if equality of contraries subsist for the third order groups, we have by the
theorems of §§ 11, 10,

\[
\begin{aligned}
2\,\{(\alpha\beta\gamma) + (\alpha CD)\} &= (\alpha\beta) + (\beta\gamma) - (A\gamma) + (\alpha C) + (CD) - (AD) \\
&= (\alpha\beta) + (\beta\gamma) + (CD) - (AD) \\
2\,\{(\alpha\beta D) + (BCD)\} &= (\alpha\beta) + (\beta D) - (AD) + (BC) + (CD) - (\beta D) \\
&= (\alpha\beta) + (BC) + (CD) - (AD) \\
&= 2\,\{(\alpha\beta\gamma) + (\alpha CD)\} \\
\therefore\ (\alpha\beta\gamma\delta) &= (ABCD)
\end{aligned}
\]

i.e., if contrary frequencies are equal in the case of third order groups they are equal
in the case of fourth order groups.

The method of proof is again quite general in its application, so Theorem III. is proved.

I have thought it again worth while to illustrate the theorem numerically, and have
drawn up Table III. for the purpose. A set of arbitrary fourth order frequencies
(set (1)), with contraries equal, was first written down, and from them the given set
of third order frequencies calculated.

TABLE III.

 Fourth order group.   (1).   (2).   (3).
   ABCD                 34     30     11
   ABCδ                 24     28     47
   ABγD                 57     61     80
   AβCD                 68     72     91
   αBCD                 42     46     65
   ABγδ                 29     25      6
   AβCδ                 39     35     16
   αBCδ                 37     33     14
   AβγD                 37     33     14
   αBγD                 39     35     16
   αβCD                 29     25      6
   Aβγδ                 42     46     65
   αBγδ                 68     72     91
   αβCδ                 57     61     80
   αβγD                 24     28     47
   αβγδ                 34     30     11
   Total               660    660    660

 Third order groups.   Frequency.
   ABC and αβγ            58
   ABD and αβδ            91
   ACD and αγδ           102
   BCD and βγδ            76
   αBC and Aβγ            79
   AβC and αBγ           107
   ABγ and αβC            86
   αBD and Aβδ            81
   AβD and αBδ           105
   ABδ and αβD            53
   αCD and Aγδ            71
   AγD and αCδ            94
   ACδ and αγD            63
   βCD and Bγδ            97
   BγD and βCδ            96
   BCδ and βγD            61

Now our theorem tells us that the equality
amongst the fourth orders depends solely on equality amongst the thirds ; so that
we ought to be able to get any number of sets of fourth order frequencies, all
possessing equality of contraries, and all consistent with the given set of third order.
That this is so will be at once evident on trial. Take

\[ (ABCD) = 30 \]

for instance. Then we have at once

\[
\begin{aligned}
(ABC\delta) &= (ABC) - (ABCD) = 58 - 30 = 28 \\
(AB\gamma\delta) &= (AB\delta) - (ABC\delta) = 53 - 28 = 25 \\
(A\beta\gamma\delta) &= (A\gamma\delta) - (AB\gamma\delta) = 71 - 25 = 46 \\
(\alpha\beta\gamma\delta) &= (\beta\gamma\delta) - (A\beta\gamma\delta) = 76 - 46 = 30
\end{aligned}
\]

giving

\[ (ABCD) = (\alpha\beta\gamma\delta) = 30. \]
Similarly all the other frequencies may be calculated, and we get set (2).

If we take (ABCD) = 11 we get set (3). All three sets are consistent with the set
of third order, and possess equality of contraries. The state of affairs is precisely the
opposite of that illustrated by Table II., where only one set of third order frequencies
could be obtained, consistent with the given set of the second order, and possessing
equality of contraries.*

It should be noted, however, that the possible number of fourth order sets in such
a case as the present is not infinite, for certain limits are imposed by the fact that
negative frequencies are impossible. Thus, if we take

\[ (ABCD) = 60 \]

we have

\[ (ABC\delta) = 58 - 60 = -2, \]

or if

\[
\begin{aligned}
(ABCD) &= 3 \\
(ABC\delta) &= 58 - 3 = 55 \\
(AB\gamma\delta) &= 53 - 55 = -2,
\end{aligned}
\]

so (ABCD) must lie at all events between the limits 58 and 5.
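Sets (2) and (3) of Table III., and the limits 5 and 58, may be recovered from set (1) alone. The sketch below relies on the constant-difference property stated in the author's note of 4/4/00: a change d in (ABCD) raises by d every group with an even number of negatives and lowers by d every group with an odd number. Lower-case letters stand for the Greek contraries, and the function names are hypothetical.

```python
# Set (1) of Table III; lower-case letters stand for the Greek contraries.
groups = ["ABCD", "ABCd", "ABcD", "AbCD", "aBCD", "ABcd", "AbCd", "aBCd",
          "AbcD", "aBcD", "abCD", "Abcd", "aBcd", "abCd", "abcD", "abcd"]
set1 = dict(zip(groups, [34, 24, 57, 68, 42, 29, 39, 37,
                         37, 39, 29, 42, 68, 57, 24, 34]))

def third_order(fourth):
    """Third order frequencies obtained by summing over one attribute."""
    out = {}
    for g, f in fourth.items():
        for i in range(4):
            key = g[:i] + g[i + 1:]          # drop the i-th attribute
            out[key] = out.get(key, 0) + f
    return out

def shifted(abcd):
    """Fourth order set having the given value of (ABCD)."""
    d = abcd - set1["ABCD"]
    return {g: f + d * (-1) ** sum(ch.islower() for ch in g)
            for g, f in set1.items()}

target = third_order(set1)
set2, set3 = shifted(30), shifted(11)

# Admissible values of (ABCD): no frequency may be negative.
admissible = [x for x in range(0, 100) if min(shifted(x).values()) >= 0]
print(set2["ABCd"], set3["ABcd"], admissible[0], admissible[-1])
```

Each third order frequency is the sum of one even and one odd group, so the shift cancels and every admissible set reproduces the same third order frequencies.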

§ 13. It follows from what we have proved that a state of complete equality of
contraries, in which this state subsists for groups of all orders, is not and cannot be
an artificial state created by choice of the points of division between A and α, B and β,
and so on, but must arise from some real and natural symmetry in the distribution of
frequency. In dealing then with the next problem, to find the number of independent
frequencies of any order in the case of complete equality of contraries, we must not
rashly apply the formulae obtained (by extrapolation, as it were) to an empirical case
in which we only know that the condition subsists for a few low orders.

The general result we arrived at was that the number of independent frequencies of
the mth order in n variables was given by the sum of the first (m + 1) terms of the
series

\[ 1 + n + \frac{n(n-1)}{1\cdot 2} + \frac{n(n-1)(n-2)}{1\cdot 2\cdot 3} + \cdots \]

In this expression we may now strike out alternate terms commencing with n, for
these represent frequencies of odd order which can be expressed in terms of the next
lower order of frequencies, and so do not give any independent data. This leaves the
series

\[ 1 + \frac{n(n-1)}{1\cdot 2} + \frac{n(n-1)(n-2)(n-3)}{1\cdot 2\cdot 3\cdot 4} + \cdots \]

and the number of independent frequencies of order 2m in n variables is equal to the
sum of the first m + 1 terms of this series.

The number of independent frequencies of order 2m + 1 is of course equal to the
number of independent frequencies of order 2m.
These rules give the numbers in Table IV. below.

* Note 4/4/00. I only noticed in reading Tables II. and III. in proof that the theorem holds "If
two sets of ultimate mth order frequencies are both consistent with a given set of (m − 1)th order, the
differences between corresponding pairs of mth order frequencies are numerically constant." It is
noted below (pp. 272, 273) that this holds for second order frequencies. The theorem is proved at once
by expanding the (m − 1)th order frequencies in terms of the two sets of the mth order.



TABLE IV. Complete Equality of Contraries.

 Number of  | Number of independent frequencies of order
 variables. |    2      3      4      5      6
 -----------+------------------------------------------
     2      |    2
     3      |    4      4
     4      |    7      7      8
     5      |   11     11     16     16
     6      |   16     16     31     31     32
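The figures of Table IV. follow from the rule of § 13 and can be recomputed as a check. In the sketch below the function name is an assumption for the illustration; the rule itself is the sum of the even-order binomial terms up to the order in question.

```python
from math import comb

def independent_with_equal_contraries(n, order):
    """Independent frequencies of the given order in n variables under
    complete equality of contraries: only even-order binomial terms count,
    so an odd order 2m + 1 gives the same number as order 2m."""
    return sum(comb(n, k) for k in range(0, order + 1, 2))

# One row of Table IV. per number of variables n
for n in range(2, 7):
    print(n, [independent_with_equal_contraries(n, m) for m in range(2, n + 1)])
```

For the ultimate groups (order n) the result is 2^(n − 1), half the number obtained in the general case.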



§ 14. In the 'Phil. Trans.' for 1898 a very striking theorem was given by Mr. W.
F. Sheppard,* expressing the frequencies like (AB), (Aβ), &c., in quadrants of the
normal surface in terms of the coefficient of correlation r. Our equations like (2) in
§ 11, on p. 263 above, enable us at once to extend the use of this theorem to the case
of three variables. Let us take two examples from the case of Heredity on the
assumption of Galton's Law.

(1) If the father and grandfather of a man are both above the average as regards
any one character, what is the chance that he will be above the average ?

The following are the correlation coefficients:

    Son and father . . . . . . .  + .3000
    Son and grandfather  . . . .  + .1500
    Father and grandfather . . .  + .3000

Mr. Sheppard's theorem then gives the following for the frequencies per 10,000
above and below average:

                               Son.
                           above   below
    Father.       above     2985    2015
                  below     2015    2985

                               Son.
                           above   below
    Grandfather.  above     2740    2260
                  below     2260    2740

The first scheme holding for father and grandfather as well as son and father.

* 'Phil. Trans.,' A, vol. 192, p. 101.
Now if we use A, B, C for son, father, grandfather, and the capitals to denote "above
average," Greek letters "below average," we want

\[ (ABC)/(BC). \]

(BC) = 2985 at once. From § 11, p. 263,

\[
\begin{aligned}
2\,(ABC) &= (AB) + (BC) - (\alpha C) \\
&= 2985 + 2985 - 2260 \\
&= 3710 \\
(ABC) &= 1855,
\end{aligned}
\]

chance required = 1855/2985 = .6214.

If only the father be known to be above average, chance of son being above average
= 2985/5000 = .5970.

If, on the other hand, we ask what is the chance that the child will be above
average if both the father and mother are so, we have, assuming the correlation with
both parents to be the same, using B for father, C for mother, and assuming no
assortative mating:

\[
\begin{aligned}
2\,(ABC) &= 2985 + 2500 - 2015 = 3470 \\
(ABC) &= 1735
\end{aligned}
\]

chance = 1735/2500 = .6940.

But if there be perfect assortative mating

\[
\begin{aligned}
2\,(ABC) &= 2985 + 5000 - 2015 = 5970 \\
(ABC) &= 2985
\end{aligned}
\]

chance = 2985/5000 = .5970.

Thus, if there be no assortative mating, a selection of father and mother is better
than a selection of parent and grandparent ; but not so if there be assortative mating
to any great extent.
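The arithmetic of these examples may be retraced as follows. The sketch assumes the usual statement of Sheppard's theorem, which the text does not write out: for normal correlation r the proportion of cases with both variables above average is 1/4 + arcsin(r)/2π. The function names are hypothetical, and frequencies are rounded to whole numbers per 10,000 as in the text.

```python
from math import asin, pi

def quadrant(r, total=10000):
    """Frequency with both variables above average for correlation r,
    taking Sheppard's theorem as total * (1/4 + arcsin(r) / (2*pi))."""
    return round(total * (0.25 + asin(r) / (2 * pi)))

def chance_a_given_bc(ab, bc, alpha_c):
    """(ABC)/(BC), with 2(ABC) = (AB) + (BC) - (alpha C) from equation (2)."""
    return (ab + bc - alpha_c) / 2 / bc

son_father = quadrant(0.30)                          # (AB)
father_grandfather = quadrant(0.30)                  # (BC)
son_below_grandfather_above = 5000 - quadrant(0.15)  # (alpha C)

# Father and grandfather both above average
print(round(chance_a_given_bc(son_father, father_grandfather,
                              son_below_grandfather_above), 4))
# Father and mother, no assortative mating: (BC) = 2500, (alpha C) = 2015
print(round(chance_a_given_bc(2985, 2500, 2015), 4))
# Father and mother, perfect assortative mating: (BC) = 5000
print(round(chance_a_given_bc(2985, 5000, 2015), 4))
```

The method applies only because equality of contraries holds for the symmetrical normal surface, which is what allows equation (2) to fix (ABC) from the second order frequencies.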

§ 15. The relations that we have dealt with in the preceding pages have a general
bearing on the theory of certain multiple integrals. If we imagine, as we have
already done on several occasions, that the distribution of frequency is really
continuous and the points of division between A and α, B and β, &c., arbitrarily fixed,
then any ultimate frequency like (ABCD), (AβCδ), for example, is equivalent to the
multiple integral expressing the total frequency contained within the four axes of
the frequency surface (or hyper-surface), taking each of these axes in either the
positive or negative direction.

Now we know that there are 2^n ultimate groups (or multiple integrals of the
above kind) to be formed from n variables, all of these groups being in the general
case independent. Suppose the function expressing the distribution of frequency to
contain m constants that remain in the expression, φ(a_1, a_2, a_3, . . . a_m), for the multiple
integral. Then we have the equations

\[
\begin{aligned}
(ABCD \ldots) &= \phi_1(a_1, a_2, a_3, \ldots a_m) \\
(\alpha BCD \ldots) &= \phi_2(a_1, a_2, a_3, \ldots a_m) \\
&\ \ \text{\&c., \&c.}
\end{aligned}
\]

or 2^n equations altogether. But if m < 2^n, we can express the constants in terms of
the frequencies by means of the first m equations, and then insert their values in the
remaining equations, thus obtaining 2^n − m necessary relations between the
frequencies. If the surface we are dealing with is symmetrical, there will be only 2^(n−1)
independent ultimate groups, and m must be less than 2^(n−1) if special relations are to
subsist between the groups.

Now this is the case in the normal surface itself. The standard deviations will not
appear in any of the multiple integrals, which must be functions solely of the
correlation coefficients, r_12, r_13, r_23, &c., and the total frequency. That is to say n variables
give 1 + n(n − 1)/2 constants that appear in the expressions for the total frequencies of
the ultimate groups. This gives the following figures:

    n     1 + n(n − 1)/2     2^(n−1)
    2           2               2
    3           4               4
    4           7               8
    5          11              16
    6          16              32

There must therefore be one relation subsisting between the ultimate fourth order
groups in normal correlation, besides the mere equality of contrary frequencies ; five
relations between the fifth order groups, sixteen between those of the sixth order,
and so on. If we could find these relations the expression of fourth order frequencies
in terms of third, sixth in terms of fifth, and so on, would cease to be indeterminate
as in the general case of equality of contraries. Mr. Sheppard's theorem could then
be extended to the case of groups of any order in normal correlation, which would
give results of great interest for calculating certain chances, e.g., the chance of a man
being above average when his father, father and grandfather, father, grandfather,
and great-grandfather were above average.

Finding myself quite unable to solve the above problem, I have handed it to 
Professor Karl Pearson ; he informs me that the relations sought depend on 
equations between the area, sides, and angles of the generalised spherical triangle in 
hyper-space, but the problem has not yet been solved. It is curious that investiga- 
tions into the theory of logic should lead to properties of hyper-spherical triangles 
or tetrahedra. 
III. — Association.

§ 16. Two qualities or attributes, A and B, are defined to be independent if the
chance of finding them together is the product of the chances of finding either of
them separately, i.e., if

\[ \frac{(AB)}{(U)} = \frac{(A)}{(U)} \cdot \frac{(B)}{(U)}, \]

or

\[ (AB)(U) = (A)(B). \]

This is, I think, the only legitimate test of dependence or independence, association
or non-association, in the general case.
§ 17. Theorem. To show that if

\[ (AB)(U) = (A)(B) \]

then

\[
\begin{aligned}
(A\beta)(U) &= (A)(\beta) \\
(\alpha B)(U) &= (\alpha)(B) \\
(\alpha\beta)(U) &= (\alpha)(\beta).
\end{aligned}
\]

Take the first equation of these three:

\[ (A)(\beta) = (A)\{(U) - (B)\} = (U)(A) - (AB)(U) = (U)\{(A) - (AB)\} = (U)(A\beta) \]

and so on for the others. So that if the chance of finding the two qualities together
is the product of the chances of finding either of them separately, the chance of
finding the one without the other is the chance of finding the one multiplied by the
chance of not finding the other, and so on. Any one of the relations implies all the
others.

§ 18. It follows at once from the above that if two attributes (A) and (B) are
independent, the products of the contrary second order frequencies are equal, i.e.,

\[ (AB)(\alpha\beta) = (A\beta)(\alpha B), \]

for each is equal to (A)(B)(α)(β) divided by (U)^2. Not only so, but the converse is
also true:

§ 19. If the cross-products are equal the variables are independent. Thus let

\[ (AB)(\alpha\beta) = (A\beta)(\alpha B). \]

Now

\[
\begin{aligned}
(A\beta) &= (A) - (AB) \\
(\alpha B) &= (B) - (AB) \\
(\alpha\beta) &= (\beta) - (A\beta) \\
&= (\beta) - (A) + (AB).
\end{aligned}
\]

Therefore

\[
\begin{aligned}
(AB)\{(\beta) - (A) + (AB)\} &= \{(A) - (AB)\}\{(B) - (AB)\} \\
(AB)\{(\beta) - (A)\} &= (A)(B) - (AB)\{(A) + (B)\} \\
(AB)\{(B) + (\beta)\} &= (A)(B) \\
(AB)(U) &= (A)(B).
\end{aligned}
\]
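The equivalence of the two criteria may be illustrated numerically. The figures below, (U) = 100, (A) = 40, (B) = 30, are invented for the purpose, and the function names are hypothetical; lower-case letters stand for the Greek contraries.

```python
def two_by_two(n_a, n_b, n_ab, n_u):
    """Second order frequencies from (A), (B), (AB) and (U)."""
    return {"AB": n_ab, "Ab": n_a - n_ab, "aB": n_b - n_ab,
            "ab": n_u - n_a - n_b + n_ab}

def independent(n_a, n_b, n_ab, n_u):
    """Criterion of section 16: (AB)(U) = (A)(B)."""
    return n_ab * n_u == n_a * n_b

def cross_products_equal(t):
    """Criterion of sections 18 and 19: (AB)(alpha beta) = (A beta)(alpha B)."""
    return t["AB"] * t["ab"] == t["Ab"] * t["aB"]

# Invented figures: (U) = 100, (A) = 40, (B) = 30, with two values of (AB).
for n_ab in (12, 20):
    t = two_by_two(40, 30, n_ab, 100)
    print(n_ab, independent(40, 30, n_ab, 100), cross_products_equal(t))
```

With (AB) = 12 both criteria are satisfied together, and with (AB) = 20 both fail together, as §§ 18 and 19 require.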



§ 20/^ Now it seems to me that one of the chief needs in handHng statistics of the 
kind we are considering is some sort of '' coefficient of association/' which should 
take the place of the ''coefficient of cori^elaMon'' for continuous variables, and be a 
measure of the approach of association towards complete independence on the one 
hand and complete association on the other. Such a coefficient should — 

(1) Be zero when the variables or attributes A, B, are independent, and only when 
they are independent. 

(2) It should be +1 when, and only when, A and B are completely associated, i,e,, 

when either 

all A's are B 

all /3's are a 

or all B's are A 

all as are /8 

or when both of these statements are true together, which can only be when 

(A) = (B), (a) = (^). 

The three diagrams below illustrate the three cases which correspond to 

{Ai8) = , (aB) = , (A/3) = (aB) = 0. 





(B) 
(AB) 


(/3) 




j 
i 

i 


(B) 


(A/3) 
(a/3) 






(B) 


(^) 


(A) 





(A)- \ 


(AB) 


(A) 


(AB) 





(a) 


(aB) 


(a/3) 


(«) 

j 

1 
1 





(a) 





(a/3) 



(3) It should be $-1$ when, and only when, A and $\beta$ or B and $\alpha$ are completely associated, i.e., when either

    all A's are $\beta$ (all B's are $\alpha$),

or

    all $\beta$'s are A (all $\alpha$'s are B),

or when both of these statements are true, which again can only be if

\[ (A) = (\beta), \qquad (\alpha) = (B). \]

The three diagrams below illustrate these cases of negative association, which correspond to

\[ (AB) = 0, \qquad (\alpha\beta) = 0, \qquad (AB) = (\alpha\beta) = 0. \]

* Note added 19/1/00. It has several times occurred to me as quite possible that I have limited myself too much in this section by defining the case of "complete association" as equivalent simply to the logical case. An association coefficient of greater analytical convenience might have been obtained by defining attributes A and B as completely associated only when all A's were B and all B's were A. The distinction of the logical case by a definite value of the association has, however, obvious conveniences.



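[Three diagrams: the universe divided into the compartments $AB$, $A\beta$, $\alpha B$, $\alpha\beta$, drawn for the three cases of complete negative association named above.]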



§ 21. The theorems just given show that

\[ Q = \frac{(AB)(\alpha\beta) - (A\beta)(\alpha B)}{(AB)(\alpha\beta) + (A\beta)(\alpha B)} \qquad (1) \]

will serve as such a coefficient of association, for:

(1) When A and B are independent the numerator is zero and therefore Q zero; and conversely when Q is zero the variables are independent.

(2) When $(A\beta) = 0$ or $(\alpha B) = 0$, or both, $Q = +1$; and conversely when $Q = +1$, $(A\beta) = 0$, $(\alpha B) = 0$, or both.

(3) When $(AB) = 0$ or $(\alpha\beta) = 0$, or both, $Q = -1$; and conversely when $Q = -1$, $(AB) = 0$ or $(\alpha\beta) = 0$, or both.

It is perfectly possible that other simple functions of the frequencies might be devised which should have the same properties, but Q at any rate will serve; I do not wish to attach too great importance to the identical function employed. If we choose Q for such a purpose, however, its properties must be investigated.
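
The three defining properties can be checked numerically. The following is only a minimal sketch: the function name `yule_q` and the frequencies are made up for illustration and are not taken from the paper.

```python
def yule_q(ab, a_beta, alpha_b, alpha_beta):
    """Return Q of equation (1) from the four second-order frequencies."""
    same = ab * alpha_beta        # cross-product (AB)(alpha beta)
    cross = a_beta * alpha_b      # cross-product (A beta)(alpha B)
    if same + cross == 0:
        raise ValueError("Q is undefined when both cross-products are zero")
    return (same - cross) / (same + cross)


# Illustrative (made-up) frequencies for the three defining cases.
print(yule_q(20, 30, 40, 60))   # independence: (AB)(U) = 20 * 150 = (A)(B) = 50 * 60
print(yule_q(25, 0, 15, 60))    # (A beta) = 0, i.e. all A's are B
print(yule_q(0, 25, 40, 35))    # (AB) = 0, i.e. no A is B
```

A table in which both cross-products vanish leaves Q undefined, so the sketch raises an error there rather than returning a value.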

§ 22. The numerator, or difference of the cross-products, has, as Professor Karl Pearson has pointed out to me, a very simple and important physical meaning. It follows immediately from the equations used in § 19 for showing that when the cross-products were equal A and B were independent; namely,

\[ (AB)(\alpha\beta) - (A\beta)(\alpha B) = (AB)(U) - (A)(B); \]

or if $(AB)_0$ be the value $(AB)$ would have if Q were zero,

\[
\begin{aligned}
(AB)(\alpha\beta) - (A\beta)(\alpha B) &= (U)\{(AB) - (AB)_0\} \\
&= (U)\{(\alpha\beta) - (\alpha\beta)_0\} \\
&= (U)\{(A\beta)_0 - (A\beta)\} \\
&= (U)\{(\alpha B)_0 - (\alpha B)\}.
\end{aligned}
\]

That is to say, "The excesses of $(AB)$ and $(\alpha\beta)$ above $(AB)_0$ and $(\alpha\beta)_0$, and of $(A\beta)_0$ and $(\alpha B)_0$ above $(A\beta)$ and $(\alpha B)$, are all equal, and equal to the ratio of the difference of the cross-products to the number of observations." This theorem seems to me rather remarkable. I can find no similar relation for the sum of the cross-products so as to give a complete physical meaning to Q.
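
The theorem of § 22 is easily verified on a fourfold table. The sketch below is illustrative only; the function name `excesses` and the counts are hypothetical.

```python
def excesses(ab, a_beta, alpha_b, alpha_beta):
    """Return the cross-product difference divided by (U), and the four excesses."""
    u = ab + a_beta + alpha_b + alpha_beta
    a, alpha = ab + a_beta, alpha_b + alpha_beta
    b, beta = ab + alpha_b, a_beta + alpha_beta
    # Values the groups would have under independence, e.g. (AB)_0 = (A)(B)/(U).
    ab_0, a_beta_0 = a * b / u, a * beta / u
    alpha_b_0, alpha_beta_0 = alpha * b / u, alpha * beta / u
    ratio = (ab * alpha_beta - a_beta * alpha_b) / u
    return ratio, (ab - ab_0, alpha_beta - alpha_beta_0,
                   a_beta_0 - a_beta, alpha_b_0 - alpha_b)


ratio, four = excesses(30, 20, 10, 40)   # made-up counts
print(ratio, four)
```

All four excesses come out equal to the ratio, whatever table is supplied.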

§ 23. Next let us determine any one of the second order frequencies, e.g. $(AB)$, in terms of Q and the first order frequencies.

If we write

\[ \kappa = \frac{1 - Q}{1 + Q} \qquad (2), \]

we have

\[ \kappa\,(AB)(\alpha\beta) = (\alpha B)(A\beta). \]

Now

\[ (\alpha B) = (B) - (AB), \qquad (A\beta) = (A) - (AB), \qquad (\alpha\beta) = (\beta) - (A) + (AB), \]

whence

\[ (AB)^2(1 - \kappa) - (AB)\{\kappa(U) + (1 - \kappa)[(A) + (B)]\} + (A)(B) = 0, \]

which is a quadratic for $(AB)$.

Now let

\[ s_1 = \frac{(A) - (\alpha)}{(A) + (\alpha)}, \qquad s_2 = \frac{(B) - (\beta)}{(B) + (\beta)} \qquad (4), \]

where $s_1$, $s_2$ may be called the surpluses of A and B. It follows that

\[ (A) = \tfrac{1}{2}(U)(1 + s_1), \qquad (\alpha) = \tfrac{1}{2}(U)(1 - s_1) \qquad (5), \]

and similarly for $(B)$ and $(\beta)$. In terms of these symbols the quadratic may be written

\[ (AB)^2 - (AB)(U)\,\frac{2 + (1 - \kappa)(s_1 + s_2)}{2(1 - \kappa)} + (U)^2\,\frac{(1 + s_1)(1 + s_2)}{4(1 - \kappa)} = 0, \]

whence

\[ (AB) = \frac{(U)}{4(1 - \kappa)}\left\{2 + (1 - \kappa)(s_1 + s_2) \pm \sqrt{(s_1 - s_2)^2 + \kappa\,[4 - (s_1 - s_2)^2 - (s_1 + s_2)^2] + \kappa^2 (s_1 + s_2)^2}\right\} \qquad (6), \]

or, replacing $\kappa$ by the original Q,

\[ (AB) = \frac{(U)}{4Q}\left\{1 + Q(1 + s_1 + s_2) \pm \sqrt{1 - 2Q s_1 s_2 - Q^2(1 - s_1^2 - s_2^2)}\right\} \qquad (7). \]

§ 24. The question arises, what is the meaning of the alternative sign in the expression for $(AB)$? One of the values given is, as a matter of fact, only a numerical solution, and is really impossible, so that the value of $(AB)$ is not indeterminate. We may write (7) in the form

\[ \frac{(AB)}{(A)} = \frac{1}{2Q(1 + s_1)}\left\{1 + Q(1 + s_1 + s_2) \pm \sqrt{1 - 2Q s_1 s_2 - Q^2(1 - s_1^2 - s_2^2)}\right\} \qquad (8), \]

and $(AB)/(A)$ must be less than 1 and greater than 0.

The product of the two values given by the $+$ and $-$ sign is

\[ \frac{(1 + Q)(1 + s_2)}{2Q(1 + s_1)}. \]

If Q be negative this is negative, and so one of the values is negative or impossible. The factor $1/2Q(1 + s_1)$ outside the bracket being then itself negative, the negative value is the one given by the $+$ sign, and we must consequently use the $-$ sign. On the other hand, if we subtract 1 from each value above and again form the product, it is

\[ \frac{(Q - 1)(1 - s_2)}{2Q(1 + s_1)}, \]

and this is negative if Q be positive; hence one of the values is greater than unity. When Q is positive we must therefore also use the $-$ sign to the radical.

If $Q = +1$, one of the roots is unity and the other greater or less, according as $s_2 \gtrless s_1$. Thus, if $Q = +1$ we have for $(AB)$

\[ (AB) = \tfrac{1}{2}(U)(1 + s_1) \quad \text{or} \quad \tfrac{1}{2}(U)(1 + s_2), \]

i.e., $(AB) = (A)$ or $(B)$.

If Q is $-1$ on the other hand, one root is zero:

\[ (AB) = 0 \quad \text{or} \quad \tfrac{1}{2}(U)(s_1 + s_2). \]

§ 25. The values of all four groups are as follows, the sign of the radical being that which gives admissible values whether Q be positive or negative. Writing $R = 1 - 2Q s_1 s_2 - Q^2(1 - s_1^2 - s_2^2)$,

\[
\begin{aligned}
(AB) &= \frac{(U)}{4Q}\left\{1 + Q(1 + s_1 + s_2) - \sqrt{R}\right\} \\
(A\beta) &= \frac{(U)}{4Q}\left\{-1 + Q(1 + s_1 - s_2) + \sqrt{R}\right\} \\
(\alpha B) &= \frac{(U)}{4Q}\left\{-1 + Q(1 + s_2 - s_1) + \sqrt{R}\right\} \\
(\alpha\beta) &= \frac{(U)}{4Q}\left\{1 + Q(1 - s_1 - s_2) - \sqrt{R}\right\}
\end{aligned}
\]

If $s_1 = s_2 = 0$, i.e., if equality of contraries subsist amongst the first and second order groups, we have

\[ (AB) = \frac{(U)}{4Q}\left\{1 + Q - \sqrt{1 - Q^2}\right\} = \frac{(U)}{2}\cdot\frac{1}{1 + \sqrt{\kappa}} \qquad (9). \]
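
The recovery of the four groups from Q and the first order frequencies can be sketched as follows. This is an illustration under stated assumptions, not the author's procedure: the function name `cells_from_q` is hypothetical, and instead of relying on a sign rule the sketch tries both signs of the radical in (7) and keeps the root for which all four groups are non-negative.

```python
import math


def cells_from_q(q, a, b, u):
    """Return (AB), (A beta), (alpha B), (alpha beta) given Q, (A), (B), (U)."""
    if q == 0:
        ab = a * b / u                      # independence
    else:
        s1 = (2 * a - u) / u                # surplus of A
        s2 = (2 * b - u) / u                # surplus of B
        radicand = 1 - 2 * q * s1 * s2 - q * q * (1 - s1 * s1 - s2 * s2)
        radical = math.sqrt(max(0.0, radicand))   # guard against rounding below zero
        roots = [u / (4 * q) * (1 + q * (1 + s1 + s2) + sign * radical)
                 for sign in (-1, 1)]
        # (AB) must leave every one of the four groups non-negative.
        low, high = max(0, a + b - u), min(a, b)
        ab = next(x for x in roots if low - 1e-9 <= x <= high + 1e-9)
    return ab, a - ab, b - ab, u - a - b + ab


print(cells_from_q(5 / 7, 50, 40, 100))    # made-up marginals, positive Q
print(cells_from_q(-5 / 7, 50, 40, 100))   # same marginals, negative Q
```

With these made-up marginals the two calls reproduce the tables (30, 20, 10, 40) and (10, 40, 30, 20) up to rounding, the admissible root being the one with the negative sign of the radical in both.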



The following short table gives the values of $\kappa$ and $\sqrt{\kappa}$ for different values of Q. The values of $\kappa$ and $\sqrt{\kappa}$ corresponding to negative Q's are the reciprocals of those corresponding to the positive values:

      Q          $\kappa$       $\sqrt{\kappa}$

     0          1            1
   + .1          .8182        .9045
   + .2          .6667        .8165
   + .3          .5385        .7338
   + .4          .4286        .6547
   + .5          .3333        .5773
   + .6          .2500        .5000
   + .7          .1765        .4201
   + .8          .1111        .3333
   + .9          .0526        .2294
   + 1.0        0            0

   - .1         1.2222       1.1056
   - .2         1.5000       1.2247
   - .3         1.8571       1.3628
   - .4         2.3333       1.5274
   - .5         3.0000       1.7321
   - .6         4.0000       2.0000
   - .7         5.6667       2.3804
   - .8         9.0000       3.0000
   - .9        19.0000       4.3589
   - 1.0        $\infty$     $\infty$



Association and Correlation.

§ 26. The theorem already referred to, due to Mr. W. F. Sheppard,* forms a connecting link between Bravais' coefficient of correlation and the association coefficient in the case of normal correlation. If the divisions between A and $\alpha$, B and $\beta$, &c., are the means of the corresponding variables,

\[ r = -\cos\frac{(AB)}{(A)}\pi = \cos\frac{(A\beta)}{(A)}\pi = \cos\frac{\sqrt{\kappa}}{1 + \sqrt{\kappa}}\pi \qquad (10), \]

where

\[ \kappa = \frac{1 - Q}{1 + Q} \]

as before.

* 'Phil. Trans.,' A, 1898, vol. 192, p. 101.



The figures below give corresponding values of Q and r:

      Q         r            Q         r

     0         0            .6        .500
     .1        .079         .7        .598
     .2        .158         .8        .707
     .3        .239         .9        .833
     .4        .322        1.0       1.000
     .5        .409

Q is always slightly in excess of r, the greatest difference being rather more than .1 for Q = .7.
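
The correspondence between Q and r in this special case (normal correlation, divisions at the means) can be recomputed directly from $r = \cos\{\pi\sqrt{\kappa}/(1 + \sqrt{\kappa})\}$ with $\kappa = (1 - Q)/(1 + Q)$. The function names below are illustrative only.

```python
import math


def kappa_from_q(q):
    """kappa = (1 - Q) / (1 + Q); requires Q > -1."""
    return (1 - q) / (1 + q)


def r_from_q(q):
    """Correlation r corresponding to Q when the divisions are at the means."""
    root = math.sqrt(kappa_from_q(q))
    return math.cos(math.pi * root / (1 + root))


for tenth in range(0, 11):
    q = tenth / 10
    print(f"Q = {q:.1f}   kappa = {kappa_from_q(q):.4f}   r = {r_from_q(q):.3f}")
```

The same two functions also regenerate the positive half of the table of $\kappa$ and $\sqrt{\kappa}$ in § 25.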

§ 27. In the general case the value of Q is necessarily a function of the position of the origin, or of the arbitrary axes which are chosen for dividing A from $\alpha$ and B from $\beta$. The evaluation of Q for any pair of axes in the case of normal correlation depends on that of certain definite integrals which have not yet been tabulated. To get some idea of the general character of the dependence I have calculated the value of Q for every possible pair of axes in the annexed (observed) frequency table; the frequencies* being the small figures, and the values of Q those entered in heavy type at each origin. An inspection of the table will show that Q is a minimum for axes near the mean of the whole table, and a maximum for origins near the limits. At the extreme boundary the values vary suddenly and erratically, owing to the necessary discontinuity of the observed frequencies, and here we may get values $-1$ for the association. In other parts of the table, however, negative values only occur in most exceptional positions, and appear to be due to accidental irregularities. The sign of Q agrees with that of the correlation coefficient r over almost the whole table.

§ 28. It does not seem possible to obtain for Q a function that shall not vary with 
the position of the axes in the general case, so long, at all events, as we adhere to 
certain conditions of symmetry for the function Q that seem to me almost necessary. 
It may perhaps be possible for a strictly normal frequency distribution. 



* There is some slight error, possibly due to copying, in the frequencies of the table, as the totals of rows and columns occasionally contain odd quarters, whereas they should only contain odd halves. I do not think this is of any practical consequence.

There is one case, and one only, where Q is independent of the axes chosen, and that is where the variables are strictly independent. Let $f_m$, $f'_n$ be the elementary frequencies corresponding to values $x_m$, $y_n$ of the variables, and let $F_{mn}$ be the frequency of the pair $(x_m, y_n)$.

[Diagram: the array of pair frequencies $F_{11}, \ldots, F_{33}$, bordered by the elementary frequencies of the two variables.]

Then, if the variables are strictly independent, we must have in every case

\[ N\,F_{mn} = f_m\,f'_n, \]

N being the total number of observations. Therefore, summing over any one quadrant, whatever the position of the axes,

\[ N\,S(F_{mn}) = S(f_m) \times S(f'_n), \]

or

\[ N\,(AB) = (A)(B), \]

and so on, so that Q is zero for all axes. It is impossible to create an artificial association, out of real independence, by mere choice of special axes. This is a most important limitation. At the same time it must be borne in mind that where the variables are not independent, as in the table on p. 277, Q may be changed in sign or rendered vanishingly small by the choice of special (possibly exceptional) axes.*
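
That no choice of axes can create association out of independence may be illustrated on a small table. The sketch below is illustrative only: the elementary frequencies are hypothetical, the function name `q_for_axes` is made up, and exact fractions are used so that the vanishing of Q is not obscured by rounding.

```python
from fractions import Fraction


def q_for_axes(table, i, j):
    """Q when rows [:i] form A, rows [i:] form alpha, columns [:j] B, columns [j:] beta."""
    def block(rows, cols):
        return sum(table[r][c] for r in rows for c in cols)

    n_rows, n_cols = len(table), len(table[0])
    ab = block(range(i), range(j))
    a_beta = block(range(i), range(j, n_cols))
    alpha_b = block(range(i, n_rows), range(j))
    alpha_beta = block(range(i, n_rows), range(j, n_cols))
    same, cross = ab * alpha_beta, a_beta * alpha_b
    return (same - cross) / (same + cross)


row_totals = [2, 5, 3]      # hypothetical elementary frequencies f_m
col_totals = [4, 1, 5]      # hypothetical elementary frequencies f'_n
n = sum(row_totals)         # total number of observations N
# Strict independence: N * F_mn = f_m * f'_n for every pair.
independent = [[Fraction(f * g, n) for g in col_totals] for f in row_totals]
qs = [q_for_axes(independent, i, j) for i in (1, 2) for j in (1, 2)]
print(qs)
```

Every one of the four possible pairs of axes gives Q exactly zero for the independent table.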



The whole subject of the connection between correlation and association demands further investigation, as it bristles with difficulties and possibilities of fallacy. In some practical cases there seems no doubt that the signs of Q and r would be different, and, indeed, the physical meaning attached to their interpretation. In the present paper, however, I do not deal further with the subject.

* Cf. also the example of assortative mating according to stature from Mr. Galton's 'Natural Selection,' p. 82.

Partial Associations and Associations between Groups of Attributes.

§ 29. In the value of Q, as written in equation (1), p. 272, the "Universe of Discourse" is understood, not expressed. If "A" represent, say, deafness, and "B" blindness, we are probably dealing with the association of these infirmities within at most one nation, e.g., English, or even one sex of the nation, e.g., English men. Letters are not given to represent that the universe is so limited, it being generally obvious from the context, but if we take D = English, E = men, we can write

\[ Q = \frac{(ABDE)(\alpha\beta DE) - (\alpha BDE)(A\beta DE)}{(ABDE)(\alpha\beta DE) + (\alpha BDE)(A\beta DE)} \qquad (11), \]

adding the letters DE to every group. Such a coefficient of association will be termed a partial coefficient, as distinguished from the total coefficient of equation (1). We may speak of partial coefficients of the 1st, 2nd, .... nth orders, according as the universe is limited by the specification of 1, 2, 3 .... n attributes. These partial and total coefficients of association correspond roughly in their nature to partial and total coefficients of correlation. In the latter case, however, we limit the universe by specifying that in all members of the universe variable x shall have the fixed magnitude h; in the former case we only specify that x shall exceed h or be less than h.

The following notation for coefficients of association seems concise and convenient. The total association between A and B we shall denote by AB between two vertical lines, thus $|AB|$. The partial association in the universe of C's, CD's, $C\delta$'s, or $C\delta\epsilon$'s we shall denote by $|AB|C|$, $|AB|CD|$, $|AB|C\delta|$, $|AB|C\delta\epsilon|$.

§ 30. The number of possible partial coefficients becomes very high as soon as we go beyond four or five variables. Supposing m attributes are given, we can form

\[ 2^n\,\frac{(m - 2)(m - 3) \cdots (m - n - 1)}{n!} \]

partial coefficients of the nth order ($n \le m - 2$) between any one pair of attributes. For we can form $2^n$ different universes with n attributes, and choose n attributes out of $(m - 2)$ in

\[ \frac{(m - 2)(m - 3) \cdots (m - n - 1)}{n!} \]

different ways. But the number of possible pairs of attributes (AB, AC, BC, &c.) is $\tfrac{1}{2}m(m - 1)$, and therefore the total number of possible partial coefficients of the nth order is

\[ 2^{n-1}\,\frac{m(m - 1)(m - 2) \cdots (m - n - 1)}{n!}. \]

These expressions give the following figures:

Number of partial coefficients of order n: (1) between any one pair of attributes; (2) altogether.

 Number of         n = 1.         n = 2.         n = 3.         n = 4.         n = 5.
 attributes m.    (1)    (2)     (1)    (2)     (1)    (2)     (1)    (2)     (1)    (2)

      3             2      6
      4             4     24       4     24
      5             6     60      12    120       8     80
      6             8    120      24    360      32    480      16    240
      7            10    210      40    840      80   1680      80   1680      32    672
      8            12    336      60   1680     160   4480     240   6720     192   5376
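
The two counting expressions of § 30 can be checked against the table with binomial coefficients, since $(m - 2)(m - 3) \cdots (m - n - 1)/n!$ is the number of ways of choosing n attributes out of $m - 2$. The function names below are illustrative only.

```python
from math import comb


def partials_for_one_pair(m, n):
    """Partial coefficients of order n between one pair, with m attributes given."""
    return 2 ** n * comb(m - 2, n)


def partials_altogether(m, n):
    """Partial coefficients of order n over all m(m - 1)/2 pairs of attributes."""
    return comb(m, 2) * partials_for_one_pair(m, n)


for m in range(3, 9):
    row = [(partials_for_one_pair(m, n), partials_altogether(m, n))
           for n in range(1, m - 1)]
    print(m, row)
```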



§ 31. But besides these partial coefficients there are others that we may form, where we deal with the association between two groups of qualities or attributes, or between a single attribute and a group. These coefficients arise naturally out of the total coefficients; for in any total coefficient a single letter may really represent an aggregate of qualities that we may more completely denote by a group of letters. Thus A may represent deaf-mutism, C imbecility, and $|AC|$ the association between deaf-mutism and imbecility; but if we amplify the notation and use A = deafness, B = dumbness, C = imbecility, the association between deaf-mutism and imbecility will be represented by

\[ |AB\,.\,C| = \frac{(ABC)(\alpha\beta\gamma) - (\alpha\beta C)(AB\gamma)}{(ABC)(\alpha\beta\gamma) + (\alpha\beta C)(AB\gamma)} \qquad (12). \]

This $|AB\,.\,C|$ is quite distinct from $|AB|C|$. The latter measures the association between A and B in a group of individuals all possessing C. The former measures the association between C and the compound attribute AB.

A more general form of association coefficient is such a "group coefficient" with the universe specified, i.e., a partial group coefficient. For example

\[ |AB\,.\,CD|E| = \frac{(ABCDE)(\alpha\beta\gamma\delta E) - (\alpha\beta CDE)(AB\gamma\delta E)}{(ABCDE)(\alpha\beta\gamma\delta E) + (\alpha\beta CDE)(AB\gamma\delta E)} \qquad (13). \]



The Method of Serial Chances.

§ 32. There is a very common method of handling such associations as we have here to deal with, more especially where it is desired to discuss the association of some one attribute A with a series of others B, C, D, &c. The chances $(AB)/(B)$, $(AC)/(C)$, &c., are simply tabulated in order of magnitude, and the attribute X for which the chance of X being A, or $(AX)/(X)$, is greatest is held to be the "most important cause of A."

The method seems to have been first brought forward, as a definite statistical method, by Quetelet, in a pamphlet published in 1832, 'Sur la possibilité de mesurer l'influence des causes qui modifient les élémens sociaux,'* but it is either explicitly or implicitly used in most statistical discussions of causation. To take an example from Quetelet's pamphlet, I give below a table of the chances of condemnation of various categories of prisoners in the French Assize Courts during the years 1825-30. Here A must stand for condemnation, B, C, D for the various attributes of the accused (superior education, being a woman, being a man, &c.). The chances tabulated are then $(AB)/(B)$, $(AC)/(C)$, &c., except No. 8, which is $(A)/(U)$. Quetelet went further than a tabulation of the simple chances, and used as a measure of the "degree of influence" of the cause the function

\[ \phi = \frac{(AB)/(B) - (A)/(U)}{(A)/(U)} \qquad (14), \]

these measures being given in the second column.



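 État de l'accusé                                                   Probabilité        Degré relatif d'influence de l'état
 [state of the accused]                                             d'être condamné    de l'accusé sur la répression
                                                                    [probability of    [relative degree of influence of the
                                                                    condemnation]      state of the accused on repression]

  1. Ayant une instruction supérieure [superior education]               .400              - .348
  2. Condamné qui est venu purger sa contumace
     [condemned in absence, surrendering for trial]                      .476              - .224
  3. Accusé de crimes contre les personnes
     [accused of crimes against the person]                              .477              - .223
  4. Sachant bien lire et écrire [able to read and write well]           .543              - .115
  5. Étant femme [being a woman]                                         .576              - .062
  6. Ayant plus de 30 ans [over 30 years of age]                         .586              - .045
  7. Sachant lire et écrire imparfaitement
     [able to read and write imperfectly]                                .600              - .023
  8. Sans désignation aucune [without any designation]                   .614                .000
  9. Étant homme [being a man]                                           .622              + .013
 10. Ne sachant ni lire ni écrire [unable to read or write]              .627              + .022
 11. Ayant moins de 30 ans [under 30 years of age]                       .630              + .026
 12. Accusé de crimes contre les propriétés
     [accused of crimes against property]                                .655              + .067
 13. Étant contumax [being tried in absence]                             .960              + .563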
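
Quetelet's measure is the difference of each chance from the general chance $(A)/(U)$ (No. 8 of the table), divided by that general chance. The following sketch recomputes it from the tabulated chances; the variable and function names are illustrative, and small differences in the third decimal place from the printed column are to be expected, since the printed chances are themselves rounded.

```python
# Chances of condemnation as tabulated, keyed by the row number of the table.
chances = {1: 0.400, 2: 0.476, 3: 0.477, 4: 0.543, 5: 0.576, 6: 0.586, 7: 0.600,
           8: 0.614, 9: 0.622, 10: 0.627, 11: 0.630, 12: 0.655, 13: 0.960}
general_chance = chances[8]     # (A)/(U), the row "without any designation"


def quetelet(chance, base):
    """Relative degree of influence: ((AB)/(B) - (A)/(U)) / ((A)/(U))."""
    return (chance - base) / base


influence = {row: quetelet(p, general_chance) for row, p in chances.items()}
for row, value in influence.items():
    print(f"{row:2d}  {chances[row]:.3f}  {value:+.3f}")
```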



* It is entitled "Lettre à M. Willermé de l'Institut de France." Bruxelles, 1832. (Royal Statistical Society's Library. Tracts, S. 4, vol. 5.)

§ 33. Now from the work in § 22, p. 272, we have at once

\[ \frac{(AB)}{(B)} - \frac{(A)}{(U)} = \frac{(AB) - (A)(B)/(U)}{(B)} = \frac{(AB) - (AB)_0}{(B)} = \frac{(AB)(\alpha\beta) - (\alpha B)(A\beta)}{(U)(B)} \qquad (15), \]

so that if $(AB)/(B) > (A)/(U)$, A and B are certainly positively associated, and if $(AB)/(B) < (A)/(U)$ negatively associated. It does not follow, however, that if $(AB)/(B) > (AC)/(C)$, A and B are more closely associated than A and C. If we write

\[ p_1 = (AB)/(B), \qquad p_2 = (AC)/(C), \]

and if $\kappa_1$, $\kappa_2$ are the values of the functions $\kappa$ for AB and AC, then


\[ \frac{\kappa_1}{\kappa_2} = \frac{p_2(1 - p_1)}{p_1(1 - p_2)} \cdot \frac{\{(1 + s_1) - p_1(1 + s_2)\}\{(1 - s_1) - (1 - p_2)(1 + s_3)\}}{\{(1 + s_1) - p_2(1 + s_3)\}\{(1 - s_1) - (1 - p_1)(1 + s_2)\}} \qquad (16), \]

$s_1$, $s_2$, $s_3$ being the surplus ratios for A, B, and C. Hence if $s_2 = s_3$ and $p_1 > p_2$ the right-hand side is certainly less than unity, and

\[ \kappa_1 < \kappa_2 \quad \text{or} \quad Q_1 > Q_2. \]

That is to say, if the surplus ratios of B and C are the same, $Q_1 > Q_2$ when $p_1 > p_2$; but if they are not, this result does not follow. We can, then, only refer to equation (16). Quetelet's function is for this purpose the same, in effect, as he only divides the difference of the chances by $(A)/(U)$. We may write his function in the form

\[ \phi = \frac{(AB)(\alpha\beta) - (A\beta)(\alpha B)}{(A)(B)} \qquad (17). \]

§ 34. Now it seems to me that association coefficients and Quetelet's functions, or chances like $(AB)/(B)$, &c., roughly correspond in their uses to correlation coefficients and regressions. The correlation coefficient is a symmetrical function of the variables, ranging between $\pm 1$, and is zero when the variables are independent. The association coefficient is a symmetrical function of the attributes, ranging between $\pm 1$, and is zero when the attributes are unassociated. The regressions are zero when the correlation coefficient is zero, but are not symmetrical functions of the variables; they depend on the values of the standard deviations as well as the correlation; and even if the regression of x on y be greater than the regression of x on z, it does not follow that $r_{xy} > r_{xz}$ unless $\sigma_y = \sigma_z$. The Quetelet functions (or simply differences of the chances $(AB)/(B) - (A)/(U)$) are zero when the association coefficient is zero, but are not symmetrical functions of the variables; they depend on the values of the surplus ratios as well as the association; and even if $\phi$ for AB be greater than $\phi$ for AC, or

\[ (AB)/(B) > (AC)/(C), \]

it does not follow that $|AB| > |AC|$ unless $s_2 = s_3$. Finally the regressions of x on y, z, &c., may be said to measure the "relative degrees of influence" of unit alterations in y, z, &c., on x, just as Quetelet takes his function to measure the "relative degrees of influence" of B, C, &c., on A. Thus, referring to the table given (p. 281), he remarks, "on voit par là qu'une instruction supérieure exerce une influence cinq fois plus grande que l'avantage d'être femme" [one sees by this that a superior education exerts an influence five times greater than the advantage of being a woman], since .348 is some five or more times .062.

I confess I do not altogether like Quetelet's function, as there does not seem to me any point in this sort of case in dividing by $(A)/(U)$, or $p_0$ in our previous notation. If $p_0$ was in one case .9 and in another .45, it seems absurd to count an attribute that raises $p_0$ by .05 in either case, half as effective in the former case as the latter; one would rather consider it more effective in the former case.

§ 35. I do not profess to have given in the foregoing pages more than an outline of the theory of the case with which the statistician has to deal; in stronger hands it could probably be carried much further. The method I have suggested has the advantage of bringing the case of association somewhat into line with that of correlation; assimilating the method and conceptions of the case of association to those of the better known field.

The statistician has to handle problems of peculiar difficulty, where the association may have any value. The logician demands $Q = \pm 1$ before he will consent to infer, and limits himself to this special and elementary case. At the opposite pole to that of the logician we may imagine a "logic of independence," where Q is always zero, a case hardly less artificial and quite as interesting as the converse, but one where inference is frequently impossible.

IV. — Probable Errors.

§ 36. Let f be the frequency of any one group of any order, and let N be the total frequency observed. Also let $\phi = f/N$. Then the standard deviation of $\phi$, or $\sigma_\phi$, is at once given by

\[ \sigma_\phi = \sqrt{\frac{\phi(1 - \phi)}{N}} \qquad (1). \]

The S.D. of the frequency f is N times this.

§ 37. Now consider the frequencies of two groups and let us find the correlation between errors in their frequencies. We must here consider two different cases, (1) where we are dealing with two ultimate groups, e.g., $(AB)$ $(A\beta)$, or $(ABC)$ $(\alpha BC)$, or $(ABC)$ $(\alpha\beta\gamma)$; (2) where we are dealing with two non-ultimate groups, e.g., $(A)$ $(B)$, or $(AB)$ $(AC)$, or $(AB)$ $(CD)$.

Case 1. Ultimate Groups. Let $f_1$, $f_2$ be the two frequencies, $\phi_1$, $\phi_2$ their ratios to N. Suppose $\phi_1$ to undergo an increment $\Delta\phi_1$; there is then a total decrement $-\Delta\phi_1$ to be spread over the remaining groups in proportion to their frequencies, the sum of the $\phi$'s being constant and equal to unity. Therefore

\[ \Delta\phi_2 = -\Delta\phi_1 \cdot \frac{\phi_2}{1 - \phi_1} \qquad (2). \]

Let $R_{\phi_1\phi_2}$ be the correlation coefficient between errors in $\phi_1$ and $\phi_2$. Then the above equation gives us

\[ R_{\phi_1\phi_2}\,\frac{\sigma_{\phi_2}}{\sigma_{\phi_1}} = -\frac{\phi_2}{1 - \phi_1}, \]

or for ultimate groups

\[ R_{\phi_1\phi_2} = -\sqrt{\frac{\phi_1\phi_2}{(1 - \phi_1)(1 - \phi_2)}} \qquad (3). \]

Hence

\[ \sigma_{\phi_1}\sigma_{\phi_2}R_{\phi_1\phi_2} = -\frac{\phi_1\phi_2}{N} \qquad (4), \]

an expression that we shall frequently require.

§ 38. Case 2. Non-ultimate Groups. Let A and B be the groups, to take the simplest case only, which is all we at present require. Then

\[ (A) + (B) - N = (AB) - (\alpha\beta), \]

or, dividing by N, say

\[ \phi_1 + \phi_2 - 1 = \pi_1 - \pi_3, \]
\[ \delta\phi_1 + \delta\phi_2 = \delta\pi_1 - \delta\pi_3 \qquad (5). \]

Squaring,

\[ \sigma_{\phi_1}^2 + \sigma_{\phi_2}^2 + 2\sigma_{\phi_1}\sigma_{\phi_2}R_{\phi_1\phi_2} = \sigma_{\pi_1}^2 + \sigma_{\pi_3}^2 - 2\sigma_{\pi_1}\sigma_{\pi_3}R_{\pi_1\pi_3} \qquad (6), \]
\[ 2\sigma_{\phi_1}\sigma_{\phi_2}R_{\phi_1\phi_2} = \frac{1}{N}\left\{(\pi_1 + \pi_3) - (\pi_1 - \pi_3)^2 - \phi_1 + \phi_1^2 - \phi_2 + \phi_2^2\right\}. \]

Substitute for $(\pi_1 - \pi_3)$ in terms of the first order frequencies, and also for $\pi_3$ in the first bracket. Then we have

\[ \sigma_{\phi_1}\sigma_{\phi_2}R_{\phi_1\phi_2} = \frac{1}{N}(\pi_1 - \phi_1\phi_2) \qquad (7), \]

or for $R_{\phi_1\phi_2}$ when these are non-ultimate groups

\[ R_{\phi_1\phi_2} = \frac{\pi_1 - \phi_1\phi_2}{\sqrt{\phi_1\phi_2(1 - \phi_1)(1 - \phi_2)}} \qquad (8). \]

Now if the attributes AB are independent $\pi_1 = \phi_1\phi_2$, so that if the attributes are unassociated errors in their frequencies are uncorrelated. On the other hand, errors in the frequencies will be perfectly correlated only if

\[ (AB) = (\alpha\beta) = 0 \]

or else

\[ (\alpha B) = (A\beta) = 0, \]

which is more than is necessary for complete association ($Q = \pm 1$). If the groups are ultimate we see from equation (3) that errors in their frequencies are always correlated, unless, indeed, the frequency of one of the groups be vanishingly small.

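§ 39. We may now proceed to find the probable errors or standard errors of $\kappa$ and Q. Let $\phi_1$, $\phi_2$, $\phi_3$, $\phi_4$ be the values of $\phi$ for any four groups forming a tetrad, e.g., let

\[ \phi_1 = (ABCDE)/N, \qquad \phi_2 = (A\beta CDE)/N, \]
\[ \phi_3 = (\alpha\beta CDE)/N, \qquad \phi_4 = (\alpha BCDE)/N. \]

Then for this tetrad $\kappa$ (equation (2), p. 273) is given by

\[ \kappa = \frac{\phi_2\phi_4}{\phi_1\phi_3}, \qquad \frac{\delta\kappa}{\kappa} = \frac{\delta\phi_2}{\phi_2} + \frac{\delta\phi_4}{\phi_4} - \frac{\delta\phi_1}{\phi_1} - \frac{\delta\phi_3}{\phi_3}. \]

Squaring and summing,

\[
\begin{aligned}
\frac{\sigma_\kappa^2}{\kappa^2} &= \frac{\sigma_{\phi_1}^2}{\phi_1^2} + \frac{\sigma_{\phi_2}^2}{\phi_2^2} + \frac{\sigma_{\phi_3}^2}{\phi_3^2} + \frac{\sigma_{\phi_4}^2}{\phi_4^2}
 + \frac{2\sigma_{\phi_2}\sigma_{\phi_4}R_{\phi_2\phi_4}}{\phi_2\phi_4} + \frac{2\sigma_{\phi_1}\sigma_{\phi_3}R_{\phi_1\phi_3}}{\phi_1\phi_3} \\
&\quad - \frac{2\sigma_{\phi_1}\sigma_{\phi_2}R_{\phi_1\phi_2}}{\phi_1\phi_2} - \frac{2\sigma_{\phi_2}\sigma_{\phi_3}R_{\phi_2\phi_3}}{\phi_2\phi_3} - \frac{2\sigma_{\phi_3}\sigma_{\phi_4}R_{\phi_3\phi_4}}{\phi_3\phi_4} - \frac{2\sigma_{\phi_1}\sigma_{\phi_4}R_{\phi_1\phi_4}}{\phi_1\phi_4} \\
&= \frac{1}{N}\left\{\frac{1 - \phi_1}{\phi_1} + \frac{1 - \phi_2}{\phi_2} + \frac{1 - \phi_3}{\phi_3} + \frac{1 - \phi_4}{\phi_4} + 4\right\} \\
&= \frac{1}{N}\left\{\frac{1}{\phi_1} + \frac{1}{\phi_2} + \frac{1}{\phi_3} + \frac{1}{\phi_4}\right\},
\end{aligned}
\]

\[ \sigma_\kappa = \frac{\kappa}{\sqrt{N}}\sqrt{\frac{1}{\phi_1} + \frac{1}{\phi_2} + \frac{1}{\phi_3} + \frac{1}{\phi_4}}. \]

We may write the above

\[ \sigma_\kappa^2 = \frac{\kappa^2}{N}\cdot\frac{\phi_2\phi_3\phi_4 + \phi_1\phi_3\phi_4 + \phi_1\phi_2\phi_4 + \phi_1\phi_2\phi_3}{\phi_1\phi_2\phi_3\phi_4} = \frac{(\phi_2\phi_3\phi_4 + \phi_1\phi_3\phi_4 + \phi_1\phi_2\phi_4 + \phi_1\phi_2\phi_3)\,\phi_2\phi_4}{N\,\phi_1^3\phi_3^3} \qquad (9). \]

Hence, if $\phi_1$ or $\phi_3$ is zero, $Q = -1$, $\kappa = \infty$, and $\sigma_\kappa = \infty$; if $\phi_2$ or $\phi_4$ is zero, $Q = +1$, $\kappa = 0$, and $\sigma_\kappa = 0$.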

§ 40. To take Q next, the standard error of which can be derived at once from that of $\kappa$,

\[ Q = \frac{1 - \kappa}{1 + \kappa}, \qquad \delta Q = -\frac{2}{(1 + \kappa)^2}\,\delta\kappa, \]
\[ \sigma_Q^2 = \frac{4}{(1 + \kappa)^4}\,\sigma_\kappa^2 \qquad (10). \]

Transform by substituting Q for $\kappa$:

\[ \sigma_Q = \frac{1 - Q^2}{2}\cdot\frac{1}{\sqrt{N}}\sqrt{\frac{1}{\phi_1} + \frac{1}{\phi_2} + \frac{1}{\phi_3} + \frac{1}{\phi_4}} \qquad (11). \]

This again becomes apparently infinite if one of the $\phi$'s vanishes, but, since $1 - Q^2 = 4\phi_1\phi_2\phi_3\phi_4/(\phi_1\phi_3 + \phi_2\phi_4)^2$, we have $\sigma_Q = 0$ whichever of the $\phi$'s is zero. So that the probable error of the association coefficient, like that of the coefficient of correlation, vanishes at the limiting values $\pm 1$.

In the case of equality of contraries we may express the standard error of Q as a
function of Q only (vide equation 9, p. 275), viz.,

\[ \sigma_Q = \frac{1 - Q^2}{\sqrt{N}}\cdot\frac{1 + \sqrt{\kappa}}{2\sqrt[4]{\kappa}} \qquad (12). \]

The standard error of the correlation coefficient is simply $(1 - r^2)/\sqrt{N}$, so the S.D. of Q is the greater (for equal numerical values of Q and r) by the fraction on the right. The value of this fraction is given below:

Ratio of Standard Error of Q to Standard Error of r
(for equal numerical values of Q and r).

      Q        Ratio.          Q        Ratio.

     .1        1.001          .6        1.061
     .2        1.005          .7        1.095
     .3        1.012          .8        1.155
     .4        1.023          .9        1.283
     .5        1.038         1.0        1.000





For corresponding values of Q and r, however, the probable error of Q is less, not greater, than that of r, i.e., if we form Q and r for the same material the probable error of the former constant is the smaller. The table on p. 276, § 26, gives corresponding values of the two coefficients, and these are repeated below with their probable errors:*

      Q       $\sqrt{N}$ × probable     Value of r          $\sqrt{N}$ × probable
              error of Q.               corresponding       error of r.
                                        to Q.

     .1           .668                     .079                 .670
     .2           .651                     .158                 .657
     .3           .621                     .239                 .636
     .4           .580                     .322                 .605
     .5           .525                     .409                 .562
     .6           .458                     .500                 .506
     .7           .377                     .598                 .441
     .8           .280                     .707                 .337
     .9           .164                     .833                 .206



In determining the value of the probable error of Q we have, however, implicitly assumed that the dividing points between A and $\alpha$, &c., were fixed and not liable to error. If the dividing points be taken to be the means, this is not so, and the probable error of Q would be increased.

* In both these tables the value used for the probable error of r corresponds to the determination of r by the product-sum method. By any other method, e.g., Mr. W. F. Sheppard's, the probable error is greater, and this would increase the divergence between Q and r, as regards reliability, in the last table.

§ 41. The standard error of the surplus ratio comes very simply,

\[ \sigma_{s_1} = \frac{2}{\sqrt{N}}\sqrt{\phi_1(1 - \phi_1)} = \frac{1}{\sqrt{N}}\sqrt{1 - s_1^2} \qquad (13), \]

so that the probable error of $s_1$ is the smaller the larger $s_1$.

§ 42. It remains to determine the correlations between errors in surplus ratios and between errors in surplus ratio and errors in association. The first problem proceeds exactly as in the case of finding the correlation between errors in two non-ultimate groups (p. 284, equations (5)-(8)).

\[ (A) + (B) - N = (AB) - (\alpha\beta), \]

or say

\[ \phi_1 + \phi_2 - 1 = \pi_1 - \pi_3, \]
\[ \delta s_1 + \delta s_2 = 2(\delta\pi_1 - \delta\pi_3), \]

where $s_1$, $s_2$ are the surplus ratios of A and B. Proceeding as in the previous case,

\[ \sigma_{s_1}\sigma_{s_2}R_{s_1 s_2} = \frac{4}{N}(\pi_1 - \phi_1\phi_2) \qquad (14). \]

Whence

\[ R_{s_1 s_2} = \frac{\pi_1 - \phi_1\phi_2}{\sqrt{\phi_1\phi_2(1 - \phi_1)(1 - \phi_2)}} \qquad (15), \]

\[ R_{s_1 s_2}\frac{\sigma_{s_1}}{\sigma_{s_2}} = \frac{\pi_1 - \phi_1\phi_2}{\phi_2(1 - \phi_2)}, \qquad R_{s_1 s_2}\frac{\sigma_{s_2}}{\sigma_{s_1}} = \frac{\pi_1 - \phi_1\phi_2}{\phi_1(1 - \phi_1)} \qquad (16). \]

These regressions are positive if A and B be positively associated. Thus if A be, for example, genius in father, B genius in son, and if in a sample of the population there be found to be a surplus of genius differing from the average by $\delta s_1$, then we should expect to find in the sons of the sample a surplus $s_2 + \delta s_2$, where

\[ \delta s_2 = R_{s_1 s_2}\,\frac{\sigma_{s_2}}{\sigma_{s_1}}\,\delta s_1. \]

§ 43. To proceed to find the correlation of errors in $Q_{12}$ and, say, $s_1$.

\[ \delta Q_{12} = -\frac{2\kappa}{(1 + \kappa)^2}\,\frac{\delta\kappa}{\kappa} = -\frac{2\kappa}{(1 + \kappa)^2}\left\{\frac{\delta\phi_2}{\phi_2} + \frac{\delta\phi_4}{\phi_4} - \frac{\delta\phi_1}{\phi_1} - \frac{\delta\phi_3}{\phi_3}\right\}. \]

Also

\[ \frac{(A)}{N} = \frac{(AB) + (A\beta)}{N} = \phi_1 + \phi_2, \qquad \delta s_1 = 2(\delta\phi_1 + \delta\phi_2). \]

Multiplying, summing, and using the standard deviations and error correlations of ultimate groups found above,

\[ \sigma_{Q_{12}}\sigma_{s_1}R_{Q_{12}s_1} = -\frac{4\kappa}{(1 + \kappa)^2}\cdot\frac{1}{N}\left\{(1 - \phi_2) - \phi_1 - \phi_1 - \phi_2 - (1 - \phi_1) + \phi_2 + \phi_1 + \phi_2\right\} = 0. \]

Therefore

\[ R_{Q_{12}s_1} = 0 \qquad (17), \]

that is to say, there is no correlation between errors in association and errors in surplus. Although we were to select out of the whole population a particular group with an abnormally large surplus ratio for any one attribute, we would not expect any definite divergence from the normal in the associations of that attribute observed within the group.

Of course all the expressions we have given above are for standard errors; the values of the probable errors will be obtained by multiplying them by the constant .674489. . . .
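
The standard errors of this Part can be evaluated for a single fourfold table as follows. This is a minimal sketch assuming the forms $\sigma_\kappa = (\kappa/\sqrt{N})\sqrt{\sum 1/\phi}$ and $\sigma_Q = \tfrac{1}{2}(1 - Q^2)(1/\sqrt{N})\sqrt{\sum 1/\phi}$; the function name and the counts are hypothetical.

```python
import math

PROBABLE_ERROR_FACTOR = 0.674489   # constant quoted in the text


def association_errors(ab, a_beta, alpha_b, alpha_beta):
    """Return (Q, standard error of kappa, standard error of Q, probable error of Q)."""
    cells = (ab, a_beta, alpha_b, alpha_beta)
    if min(cells) <= 0:
        raise ValueError("every group must be positive for these expressions")
    kappa = (a_beta * alpha_b) / (ab * alpha_beta)
    q = (1 - kappa) / (1 + kappa)
    # (1/sqrt(N)) * sqrt(sum of 1/phi) is the same as sqrt(sum of 1/frequency).
    spread = math.sqrt(sum(1 / c for c in cells))
    se_kappa = kappa * spread
    se_q = 0.5 * (1 - q * q) * spread
    return q, se_kappa, se_q, PROBABLE_ERROR_FACTOR * se_q


q, se_kappa, se_q, pe_q = association_errors(30, 20, 10, 40)   # made-up counts
print(round(q, 4), round(se_kappa, 4), round(se_q, 4), round(pe_q, 4))
```

The sketch refuses tables with an empty group, for which $\kappa$ is zero or infinite and the expressions must be taken in the limit.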

V. — Illustrations.

A. Miscellaneous.

(1.) Small-pox attack rate and vaccination.

(2.) Examples from Mr. Galton's "Natural Inheritance":
     Assortative mating according to temper.
     Association of temper in fraternities.
     Inheritance of artistic faculty.
     Assortative mating according to stature.

(3.) Examples from Darwin's "Cross and Self Fertilisation":
     Cross fertilisation of parentage and tallness of offspring.
     Pure self fertilisation and crossing of flowers on same plant.

(A). Miscellaneous.

§ 44. (1.) Small-pox and vaccination.

At the very commencement of this paper death-rates, with and without the administration of an antitoxin, were suggested as affording suitable examples of association. Death-rates by small-pox amongst the vaccinated and unvaccinated would form such an instance, but the figures I found most suitable to my purpose are attack-rates, not death-rates. The following table gives the (percentage) small-pox attack rate, in houses actually invaded by small-pox, of persons under and over 10 years of age, in five towns in which small-pox epidemics have recently occurred.*



                               Attack rate under 10.            Attack rate over 10.
 Town.           Date.        Vaccinated.  Unvaccinated.      Vaccinated.  Unvaccinated.

 Sheffield       1887-88          7.9          67.6              28.3          53.6
 Warrington      1892-93          4.4          54.5              29.9          57.6
 Dewsbury        1891-92         10.2          50.8              27.7          53.4
 Leicester       1892-93          2.5          35.3              22.2          47.0
 Gloucester      1895-96          8.8          46.3              32.2          50.0



From these data we can work out the association between " lack of vaccination " and 
" attack,'' for children and persons over 10 years of age. If we call attack A, '' non- 
vaccination" B, the data given are 100 (A/3)/(/8) and 100 (AB)/(B) ; subtracting each 
percentage from 100, we get 100(a^)/(/8) and 100 (aB)/(B). Thus for the coefficient 
of association in Sheffield for children we have 



Q 



«- ^'^•^ X 92-1 >- 32-4 x^^O 
"^^ 67-6 X 92-1 + 32-4 x 7'9 
= -92 



where we have divided through numerator and denominator of the ordinary expression
for Q by (B)(β), leaving its value unaltered. This seems rather an interesting
case, as the form in which the data are presented does not give the surplus ratio for
non-vaccination, i.e., the ratio of non-vaccination to vaccinated, but does give the
association coefficient Q. The whole series of values are given below, and form a
striking addition to the previous table. The association between non-vaccination
and attack is very high indeed for young children, .8 to .9, but drops sharply to

Association between Non-vaccination and Attack in Infected Households.

Town.            Children under 10.    Persons over 10.

Sheffield               .92                  .49
Warrington              .93                  .52
Dewsbury                .80                  .50
Leicester               .91                  .51
Gloucester              .80                  .36




* I have taken the table from Mr. Noel A. Humphrey's paper, "Vaccination and Small-Pox Statistics,"
'Journal Royal Statistical Soc.,' vol. 60 (1897), p. 525. It is quoted by him from the 'Final Report of
the Vaccination Commission,' p. 65.
.5 (owing presumably to the waning protection of the vaccination made in infancy)
in the older age group.
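
The town-by-town coefficients can be checked directly from the percentage attack rates. The following is a minimal sketch, assuming the rates as tabulated above; the names `ATTACK_RATES`, `q_from_rates` and `association_by_town` are illustrative and not part of the original paper.

```python
# Percentage attack rates from the table of five towns:
# (vaccinated under 10, unvaccinated under 10, vaccinated over 10, unvaccinated over 10)
ATTACK_RATES = {
    "Sheffield": (7.9, 67.6, 28.3, 53.6),
    "Warrington": (4.4, 54.5, 29.9, 57.6),
    "Dewsbury": (10.2, 50.8, 27.7, 53.4),
    "Leicester": (2.5, 35.3, 22.2, 47.0),
    "Gloucester": (8.8, 46.3, 32.2, 50.0),
}


def q_from_rates(rate_vaccinated, rate_unvaccinated):
    """Association coefficient Q between non-vaccination (B) and attack (A).

    The inputs are 100(A beta)/(beta) and 100(AB)/(B). Dividing the usual
    expression for Q by (B)(beta) lets us work with percentages alone.
    """
    attacked_unvacc = rate_unvaccinated          # 100 (AB)/(B)
    escaped_unvacc = 100.0 - rate_unvaccinated   # 100 (alpha B)/(B)
    attacked_vacc = rate_vaccinated              # 100 (A beta)/(beta)
    escaped_vacc = 100.0 - rate_vaccinated       # 100 (alpha beta)/(beta)
    same = attacked_unvacc * escaped_vacc
    cross = escaped_unvacc * attacked_vacc
    return (same - cross) / (same + cross)


def association_by_town():
    """Return {town: (Q for children under 10, Q for persons over 10)}."""
    result = {}
    for town, (v_u10, u_u10, v_o10, u_o10) in ATTACK_RATES.items():
        result[town] = (
            round(q_from_rates(v_u10, u_u10), 2),
            round(q_from_rates(v_o10, u_o10), 2),
        )
    return result


if __name__ == "__main__":
    for town, (children, older) in association_by_town().items():
        print(f"{town:<12} under 10: {children:.2f}   over 10: {older:.2f}")
```

Only the two attack rates in each age group are needed; the numbers of vaccinated and unvaccinated persons never enter, which is why Q is obtainable although the surplus ratio is not.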

The constancy of the association in towns with widely different attack rates is
a point worthy of notice. Sheffield, Warrington, and Leicester exhibit practically
identical associations, although the attack rates vary from 7.9 to 2.5 and 67.6 to 35.3.
Not having the original figures for these cases I cannot state the probable errors.

§ 45. (2.) From Mr. Galton's "Natural Inheritance."

Assortative Mating according to Temper. On p. 231 of "Natural Inheritance"
Mr. Galton gives the data, based on 111 marriages:

Good-tempered husbands with bad-tempered wives  . . 24 per cent.
Bad-tempered husbands with good-tempered wives  . . 31 per cent.
Good-tempered husbands with good-tempered wives . . 22 per cent.
Bad-tempered husbands with bad-tempered wives   . . 23 per cent.


Here

$$Q = \frac{22 \times 23 - 24 \times 31}{22 \times 23 + 24 \times 31} = -.19$$


for the association between temper in husband and wife, i.e., on the whole
bad-tempered husbands have good-tempered wives, and vice versa. But the probable
error of the association =

$$.6745 \, \frac{1 - .0361}{2} \sqrt{\frac{100}{111}} \sqrt{\frac{1}{22} + \frac{1}{23} + \frac{1}{24} + \frac{1}{31}} = .124$$

so that only very slight stress can be laid on the sign of the association.
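
As a check on the arithmetic, the coefficient and its probable error can be sketched as follows. This assumes the probable-error expression displayed above, with the percentages converted to frequencies on 111 marriages; the function names are illustrative only.

```python
import math


def yule_q(ab, a_beta, alpha_b, alpha_beta):
    """Association coefficient Q from the four second-order frequencies."""
    same = ab * alpha_beta
    cross = a_beta * alpha_b
    return (same - cross) / (same + cross)


def probable_error_q(ab, a_beta, alpha_b, alpha_beta):
    """Probable error of Q: 0.6745 * (1 - Q^2)/2 * sqrt(sum of reciprocals)."""
    q = yule_q(ab, a_beta, alpha_b, alpha_beta)
    reciprocal_sum = 1 / ab + 1 / a_beta + 1 / alpha_b + 1 / alpha_beta
    return 0.6745 * (1 - q * q) / 2 * math.sqrt(reciprocal_sum)


# Percentages of the 111 marriages, husband's temper first.
N_MARRIAGES = 111
PERCENT = {"good_good": 22, "good_bad": 24, "bad_good": 31, "bad_bad": 23}

# Convert percentages to (fractional) frequencies; Q itself is unaffected by
# the conversion, but the probable error depends on the actual numbers.
FREQ = {key: value * N_MARRIAGES / 100 for key, value in PERCENT.items()}

q_temper = yule_q(FREQ["good_good"], FREQ["good_bad"], FREQ["bad_good"], FREQ["bad_bad"])
pe_temper = probable_error_q(
    FREQ["good_good"], FREQ["good_bad"], FREQ["bad_good"], FREQ["bad_bad"]
)

if __name__ == "__main__":
    print(f"Q = {q_temper:+.2f}, probable error = {pe_temper:.3f}")
```

The coefficient is only about one and a half times its probable error, which is the numerical content of the remark that little stress can be laid on its sign.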

The advantage of having the whole question of the association thus compressed
into one figure, with a definite probable error, is here very clearly marked.
Comparing the actual figures with the distribution in the case of no association,
Mr. Galton concluded "that the figures taken from observation run as closely
with those derived through calculation as could be expected from the small number
of observations."*

§ 46. Association of Temper in Fraternities. On p. 235 of "Natural Inheritance"
Mr. Galton gives, in the same investigation, data for the association of temper in a
fraternity (group of brothers and sisters). Thus in sixty-six fraternities of three
members there were eleven cases reported in which all were good-tempered; fifteen
in which one was good and two bad; twenty-one in which one was bad and two good;
and eight in which all were bad. From data of this kind I formed all the possible
pairs (permutations). Thus in the above case, using G for good, γ for bad, I find the
number of pairs as below:

* This conclusion is in part affected by a slight error (owing to accumulation) in the "non-associated"
figures given. The (constant) difference between the observed and "non-associated" frequencies is 2 per
cent. to the nearest unit.
Number of fraternities of three.     Giving pairs   Giving pairs   Giving pairs   Giving pairs
                                         GG.            Gγ.            γG.            γγ.

All good . . . . . . . . 11               66            ...            ...            ...
2 good, 1 bad  . . . . . 15               30             30             30            ...
1 good, 2 bad  . . . . . 21              ...             42             42             42
All bad  . . . . . . . .  8              ...            ...            ...             48

Total  . . . . . . . . .                  96             72             72             90

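
The formation of pairs can be sketched as below, taking "all the possible pairs (permutations)" to mean every ordered pair of distinct members within a fraternity. The fraternity counts are those of the table above; the letter `g` stands in for γ, and the function names are illustrative.

```python
from collections import Counter
from itertools import permutations


def ordered_pairs(n_good, n_bad):
    """Count ordered pairs of distinct members in one fraternity.

    'G' marks a good-tempered member and 'g' (for gamma) a bad-tempered one.
    itertools.permutations treats members by position, so two good-tempered
    siblings still yield two distinct ordered 'GG' pairs.
    """
    members = ["G"] * n_good + ["g"] * n_bad
    return Counter(first + second for first, second in permutations(members, 2))


def pool_pairs(fraternities):
    """Sum the ordered pairs over (n_good, n_bad, number_of_fraternities) rows."""
    totals = Counter()
    for n_good, n_bad, how_many in fraternities:
        for kind, count in ordered_pairs(n_good, n_bad).items():
            totals[kind] += count * how_many
    return totals


# Fraternities of three, as tabulated: (good, bad, number of fraternities).
FRATERNITIES_OF_THREE = [(3, 0, 11), (2, 1, 15), (1, 2, 21), (0, 3, 8)]

PAIR_TOTALS = pool_pairs(FRATERNITIES_OF_THREE)

if __name__ == "__main__":
    for kind in ("GG", "Gg", "gG", "gg"):
        print(kind, PAIR_TOTALS[kind])
```

Because every pair is entered in both orders, the Gγ and γG totals are necessarily equal and the resulting fourfold table is symmetrical.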



Getting out the number of pairs in the same way for fraternities of all the sizes
given, I find the totals:

Good-good . . . . . . . . .   330 pairs.
Good-bad and bad-good . . .   255 pairs each.
Bad-bad . . . . . . . . . .   454 pairs.

$$Q = \frac{330 \times 454 - 255 \times 255}{330 \times 454 + 255 \times 255} = .39$$



This value seems rather low; the fraternal correlation of .4 should correspond to
an association of about .49, or more as the axes of division are not taken through
the medians.

§ 47. Inheritance of Artistic Faculty ("Natural Inheritance," p. 218). The data
are:

Number of artistic children with artistic parentage . . . . . . .   296
Number of artistic children with non-artistic parentage . . . . .   173
Number of non-artistic children with artistic parentage . . . . .   372
Number of non-artistic children with non-artistic parentage . . .   666



We have called the parentage artistic where either parent is so entered. Mr.
Galton's table does not separate the sexes. The above is consequently a kind of
"mid-parentage" inheritance table. The association coefficient is

Q = .508 ± .029.

The correlation of offspring with mid-parent is .42, corresponding to an association
of .51, or remarkably close to the above; but the close agreement must be more or
less accidental. As we have seen (p. 276, and example), shifting the axes of division
of a correlation surface towards the extremities of the surface, so as to increase or
decrease the surplus ratio of each attribute, increases the association between them.
This ought to have made the association somewhat higher in the present case, as
there are only 469 artistic children to 1038 non-artistic, or a surplus ratio of .378.
It is interesting to note that apparently, from Mr. Galton's figures, there is a
negative correlation between the artistic character of parentage and fertility; thus:

When both parents are artistic, number of children to fraternity is 4.93
When one parent is artistic, number of children to fraternity is 5.15

,, neitner ,, ,, ,, ,, ,, o Zio 

This would not, however, affect the association between artistic faculty of parentage
and offspring, as increasing or decreasing the frequencies 296 and 372 in any constant
ratio would not alter the ratio of the cross-products. The interest lies in the
fact that artistic faculty is apparently a heritable attribute associated (negatively)
with fertility, and hence (as Professor Pearson has pointed out) would tend to
disappear in the absence of opposing causes.
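
The invariance just stated is easily verified numerically. The sketch below multiplies the two frequencies for artistic parentage (296 and 372) by a common factor and recomputes Q; the factors chosen are arbitrary illustrations, not figures from the paper.

```python
def yule_q(ab, a_beta, alpha_b, alpha_beta):
    """Association coefficient Q from the four second-order frequencies."""
    same = ab * alpha_beta
    cross = a_beta * alpha_b
    return (same - cross) / (same + cross)


# A = artistic child, B = artistic parentage (figures from the data above).
ARTISTIC = {"AB": 296, "A_beta": 173, "alpha_B": 372, "alpha_beta": 666}

q_observed = yule_q(
    ARTISTIC["AB"], ARTISTIC["A_beta"], ARTISTIC["alpha_B"], ARTISTIC["alpha_beta"]
)

# Hypothetical fertility factors applied to families of artistic parentage only.
ILLUSTRATIVE_FACTORS = (0.5, 0.9, 2.0, 10.0)

q_rescaled = [
    yule_q(
        ARTISTIC["AB"] * factor,        # (AB) scaled
        ARTISTIC["A_beta"],             # (A beta) untouched
        ARTISTIC["alpha_B"] * factor,   # (alpha B) scaled by the same factor
        ARTISTIC["alpha_beta"],         # (alpha beta) untouched
    )
    for factor in ILLUSTRATIVE_FACTORS
]

if __name__ == "__main__":
    print(f"observed Q = {q_observed:.3f}")
    for factor, q in zip(ILLUSTRATIVE_FACTORS, q_rescaled):
        print(f"factor {factor:>4}: Q = {q:.3f}")
```

The common factor multiplies both cross-products alike and so cancels from the ratio, which is why differential fertility of the parentage classes leaves the coefficient untouched.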

§ 48. Assortative Mating according to Stature. I give this example as an instance
of the fact that the association between attributes depends very largely, in some
cases, on what I have called the choice of axes, i.e., the strictness of definition of the
attribute. The following are the data giving the number of observed cases in which
a tall, medium, or short husband was mated with a tall, medium, or short wife
("Natural Inheritance," p. 206).









                                      Wife.
Husband.               Tall.        Medium.       Short.

Tall  . . . . . . .     18            28            14
Medium  . . . . . .     20            51            28
Short . . . . . . .     12            25             9



Now, if we take the dividing point between the "tall" or "fairly tall," and "fairly
short" or "short" to be (1) between tall and medium, (2) between medium and short,
we get the following data:





                                            (1)        (2)

Tall husband and tall wife . . . . . .       18        117
Tall husband and short wife  . . . . .       42         42
Short husband and tall wife  . . . . .       32         37
Short husband and short wife . . . . .      113          9

Total  . . . . . . . . . . . . . . . .      205        205
In the first case

Q = + .20 ± .11

In the second

Q = - .19 ± .13.

Thus the first case gives a positive, the second a negative association between
stature of husband and stature of wife; in both cases the value of Q is greater than
its probable error, and the difference between the two Q's is more than twice the
probable error of the difference.
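
The two fourfold tables, and the change of sign, can be reproduced from the nine-cell table. This is a sketch under the assumption that the probable error of Q is 0.6745 (1 - Q²)/2 times the root of the summed reciprocals of the four frequencies, as used earlier in the paper; the names are illustrative.

```python
import math

# Rows: husband tall, medium, short. Columns: wife tall, medium, short.
STATURE = [
    [18, 28, 14],
    [20, 51, 28],
    [12, 25, 9],
]


def collapse(table, cut):
    """Collapse a 3x3 table to 2x2, counting the first `cut` classes as 'tall'.

    Returns (tall-tall, tall husband/short wife, short husband/tall wife,
    short-short).
    """
    size = len(table)
    upper, lower = range(cut), range(cut, size)
    tall_tall = sum(table[i][j] for i in upper for j in upper)
    tall_short = sum(table[i][j] for i in upper for j in lower)
    short_tall = sum(table[i][j] for i in lower for j in upper)
    short_short = sum(table[i][j] for i in lower for j in lower)
    return tall_tall, tall_short, short_tall, short_short


def q_and_probable_error(ab, a_beta, alpha_b, alpha_beta):
    """Return (Q, probable error of Q) for one fourfold table."""
    same, cross = ab * alpha_beta, a_beta * alpha_b
    q = (same - cross) / (same + cross)
    reciprocal_sum = 1 / ab + 1 / a_beta + 1 / alpha_b + 1 / alpha_beta
    return q, 0.6745 * (1 - q * q) / 2 * math.sqrt(reciprocal_sum)


Q_FIRST, PE_FIRST = q_and_probable_error(*collapse(STATURE, 1))    # tall | medium
Q_SECOND, PE_SECOND = q_and_probable_error(*collapse(STATURE, 2))  # medium | short

# Probable error of the difference, treating the two errors as independent
# (an assumption of this sketch).
PE_DIFFERENCE = math.hypot(PE_FIRST, PE_SECOND)

if __name__ == "__main__":
    print(f"(1) Q = {Q_FIRST:+.2f} ± {PE_FIRST:.2f}")
    print(f"(2) Q = {Q_SECOND:+.2f} ± {PE_SECOND:.2f}")
    print(f"difference = {Q_FIRST - Q_SECOND:.2f}, probable error = {PE_DIFFERENCE:.2f}")
```

Moving the line of division by a single class is enough to reverse the sign, so the choice of axes is part of the definition of the attribute and must be stated with the coefficient.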

I think the above change of sign implies that while tallness in husband is associated
with tallness in wife, (extreme) shortness is not associated with (extreme) shortness.
Thus 30 per cent. of the tall husbands have tall wives, but only 20 per cent. of the
short husbands have short wives; 36 per cent. of the tall wives have tall husbands,
but only 18 per cent. of the short wives have short husbands. While it appears at
first sight an unsatisfactory characteristic of association that its sign may depend on
the axes chosen, I believe that this is not really the case. On the contrary, such
changes of sign may call attention to important physical realities, masked by the
application (possibly) of the "rectilinear" theory of correlation, to cases where it
gives a result of a somewhat crudely average character. Where only one average
sort of result is similarly desired for the association I think the lines of division
between A and α, and so on, should be taken through the means or medians.

§ 49. (3.) Darwin's "Cross and Self Fertilisation of Plants."

The attributes, the association of which is here discussed, are "crossing of
parentage" and "tallness of height" in plants. Thus, the one attribute is really
invariable; the parentage must be either crossed or self-fertilised, since asexual
propagation is excluded. The other attribute, height, is, however, really variable,
and hence, in accordance with the preceding remarks, the point of division between
tallness and shortness is best taken at the mean.

In many of the species the number experimented on by Darwin is too small to 
give any reliable coefficient of association. I have therefore picked out only a few 
of the species for which most data were available and investigated them, to see 
whether there were any reliable differences between the associations observed for 
different species, and as a rough ground for comparison I have pooled together the 
results for the thirty-eight different species for which there were sufficient data, and 
worked out the association for the total. The data are given in the table below. 

The "average height" referred to is the average height of cross and self-fertilised
plants taken all together; if their numbers were unequal the cross and self-fertilised
were averaged separately, and the mean of the two averages taken. Different
generations are also all pooled together for each species, but each of the tables in the book
(different generations or experiments) was averaged separately, and the heights in it
referred to its own averages.*

* For other details I must refer to the book itself. Thus, in some tables a pot of " crowded plants," in 
The associations I find for the whole mass and for the five species chosen are

                              Number of plants.

Whole series . . . . . . . .       1094          Q = + .66 ± .025
Ipomea purpurea  . . . . . .        146            = + .90 ± .028
Petunia violacea . . . . . .        154            = + .90 ± .026
Reseda lutea . . . . . . . .         64            = + .74 ± .086
Reseda odorata . . . . . . .        110            = + .49 ± .103
Lobelia fulgens  . . . . . .         68            = + .29 ± .153




Are these differences significant? Taking successive differences down the table
from Petunia violacea onwards I find

                                                             Probable error
                                              Difference.    of difference.

Petunia violacea and Reseda lutea . . . .        .16             .090
Reseda lutea and Reseda odorata . . . . .        .25             .134
Reseda odorata and Lobelia fulgens  . . .        .20             .184

and for the extreme difference

Petunia violacea and Lobelia fulgens  . .        .61             .155

These figures can leave no doubt, I think, that specific differences do exist as
regards closeness of association between crossing and vigour of offspring, even in
species all normally cross fertilised. The difference between Ipomea purpurea or
Petunia violacea and Lobelia fulgens is certainly significant, and not only so but each
successive difference in the above short table is greater than its probable error. It
must be remembered that we are dealing with a different point to that noted by
Darwin; he is dealing with the amount of the difference between crossed and self
fertilised offspring; we are measuring the approach towards absoluteness of the law
that there is a constant difference. The law is much less absolute (permits of many
more individual exceptions) with some species than with others.
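
The tabulated probable errors of the differences agree with the usual rule for independent quantities, namely the square root of the sum of the squared probable errors. A minimal sketch, assuming that rule and the coefficients as listed for the chosen species (names illustrative):

```python
import math

# (Q, probable error) as listed for the chosen species.
SPECIES_Q = {
    "Petunia violacea": (0.90, 0.026),
    "Reseda lutea": (0.74, 0.086),
    "Reseda odorata": (0.49, 0.103),
    "Lobelia fulgens": (0.29, 0.153),
}


def difference_with_error(first, second):
    """Return (difference of Q's, probable error of that difference).

    Assumes the two coefficients are independent, so that their probable
    errors combine as the root of the sum of squares.
    """
    q_first, pe_first = SPECIES_Q[first]
    q_second, pe_second = SPECIES_Q[second]
    return q_first - q_second, math.hypot(pe_first, pe_second)


COMPARISONS = [
    ("Petunia violacea", "Reseda lutea"),
    ("Reseda lutea", "Reseda odorata"),
    ("Reseda odorata", "Lobelia fulgens"),
    ("Petunia violacea", "Lobelia fulgens"),
]

if __name__ == "__main__":
    for first, second in COMPARISONS:
        diff, pe = difference_with_error(first, second)
        print(f"{first} and {second}: {diff:.2f} ± {pe:.3f} ({diff / pe:.1f} times)")
```

On this reckoning the extreme difference is nearly four times its probable error, while each successive difference only slightly exceeds its own.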

A curious point is the significant difference between the wild and cultivated species
of Reseda. The difference may possibly be due to the cultivated character of
R. odorata, but certainly need not be, as the two first species on the list in which the
association between height and crossing is .9 are both cultivated species foreign
to England. Reseda odorata was also erratic in its behaviour as regards self sterility
(cf. "Cross and Self Fertilisation," p. 119 and pp. 336-9), some plants being highly self
fertile, others quite self sterile. The offspring of highly and slightly self fertile plants
were, however, equally vigorous.

In the table on p. 295 I have entered the sign of the association in the column on the
left ; as I have stated, the probable errors are so large that it seems misleading to give 

which only the tallest of each lot was measured, is included. In other cases I have pooled outdoor-grown 
plants with plants in pots, taking the average height separately as above, and so on. 
Association between Cross Fertilisation and Tallness of Height (Charles Darwin,
"Cross and Self Fertilisation").

"Above" and "Below" mean above or below the average of their generation or series.

                                  Sign of                  Cross fertilised.     Self fertilised.
                                  associa-    Total
      Species.                    tion.       observed.    Above.    Below.      Above.    Below.

  1. Ipomea purpurea . . . . .      +           146          63        10          18        55
  2. Digitalis purpurea  . . .      +            24          12         4           2         6
  3. Verbascum thapsus . . . .      +            12           4         2           3         3
  4. Gesneria pendulina  . . .      +            16           5         3           3         5
  5. Salvia coccinea . . . . .      +            12           4         2           2         4
  6. Origanum vulgare  . . . .      0             8           2         2           2         2
  7. Brassica oleracea . . . .      -            18           4         5           5         4
  8. Iberis umbellata  . . . .      +            14           6         1           3         4
  9. Papaver vagum . . . . . .      +            30           9         6           6         9
 10. Eschscholzia californica       +             8           3         1           1         3
 11. Reseda lutea  . . . . . .      +            64          25         7          11        21
 12. Reseda odorata  . . . . .      +           110          39        16          25        30
 13. Viola tricolor  . . . . .      +            28          12         2           1        13
 14. Delphinium consolida  . .      +            12           3         3           2         4
 15. Viscaria oculata  . . . .      0            30           8         7           8         7
 16. Dianthus caryophyllus . .      +            16           5         3           4         4
 17. Hibiscus africanus  . . .      -             8           2         2           3         1
 18. Pelargonium zonale  . . .      +            14           4         3           3         4
 19. Tropæolum minus . . . . .      +            16           6         2           2         6
 20. Limnanthes douglasii  . .      +            31          12         4           4        11
 21. Lupinus luteus  . . . . .      +            16           8         0           2         6
 22. Phaseolus multiflorus . .      0            10           3         2           3         2
 23. Lathyrus odoratus . . . .      +            16           5         3           2         6
 24. Clarkia elegans . . . . .      +             8           3         1           1         3
 25. Bartonia aurea  . . . . .      -            16           3         5           4         4
 26. Scabiosa atropurpurea . .      +             8           2         2           1         3
 27. Lactuca sativa  . . . . .      -            13           3         4           3         3
 28. Specularia speculum . . .      +             8           2         2           1         3
 29. Lobelia ramosa  . . . . .      +            14           5         2           1         6
 30. Lobelia fulgens (2nd gen.)     +            68          17        17          12        22
 31. Nemophila insignis  . . .      +            22          11         1           2         8
 32. Borago officinalis  . . .      +             8           2         2           1         3
 33. Nolana prostrata  . . . .      -            10           2         3           3         2
 34. Petunia violacea  . . . .      +           154          61        16          13        64
 35. Nicotiana tabacum . . . .      -            34           7        10           8         9
 36. Beta vulgaris . . . . . .      +            16           6         2           4         4
 37. Zea mays  . . . . . . . .      +            30          12         3           3        12
 38. Phalaris canariensis  . .      +            46          15         8           7        16

     Totals  . . . . . . . . .      +          1094         395       168         179       372




the amounts in all cases. In six cases of the thirty-eight the sign is negative, and
in three cases the association is zero.

§ 50. There is another interesting point of the same investigation in which the
present method will enable us to state clearly the quantitative result and its probable
error. This is the question whether fertilisation of a flower with pollen from another
flower on the same plant is any better than strict self-fertilisation. Darwin came to
the conclusion that in only one of the five species tried, Digitalis purpurea, was
there any sensible advantage in crossing different flowers. The point was a difficulty,
as there should be an advantage, though a slight one, on his theory that the benefit
of cross-fertilisation arises from differences in the general constitution of flowers
crossed.

In only three of the five species tried are the numerical data given in the form 
required for the present method. They run as below :— 





                                                    Species.
                                     Ipomea         Mimulus        Digitalis
                                     purpurea.      luteus.        purpurea.       Total.

Table in book . . . . . . . . .      XII.           XXI., XXII.    XXIV.

Crossed         Above average .      17             21             19              57
                Below average .      14             16              6              36
Self-fertilised Above average .      22             16             11              49
                Below average .       9             21             14              44

Total . . . . . . . . . . . . .      62             74             50             186

Association coefficient Q . . .    - .34 ± .160   + .27 ± .147   + .60 ± .133    .174 ± .097

Difference of Q's and probable error of difference:
    Ipomea purpurea and Mimulus luteus . . . . .   .61 ± .217
    Mimulus luteus and Digitalis purpurea  . . .   .33 ± .198





Thus, in the case of Ipomea purpurea, the association is distinctly negative (crossing
with another flower of the plant was worse than pure self-fertilisation), but
in both the other cases it is positive. Mimulus, it is true, offers somewhat doubtful
evidence (two experiments having given conflicting results), but the coefficient of
association is almost twice its probable error. The foxglove certainly exhibits far the
highest and most significant association. Two out of the three species give a positive
result, and if all the species are pooled together the result is positive. The fact is we
are looking, in all probability, for a very small association, and extensive experiments
may be necessary to render its existence certain. Taking the above results together,
I should certainly say they gave evidence on the whole of a positive association. The
odds against the negative association in the case of Ipomea occurring as a purely
chance deviation from the positive would be, roughly, twenty-one to four, or say five
to one only: not overwhelming odds by any means.

As in the general case of complete crossing versus self-fertilisation, the differences
between species are, however, almost certainly significant. It may be true that
"crossing of different flowers on the same plant is always, on the average, better than
pure self-fertilisation," but the closeness with which the law holds good will vary in
different species.

VI. Illustrations (continued).
(B.) The Association of Defects in Children and Adults.

§ 51. The material on which the following investigation is based is drawn almost
wholly from the "Report on the Scientific Study of the Mental and Physical
conditions of Childhood,"* issued by a committee with representatives from the
British Medical Association, the British Association, the Charity Organization
Society, &c. Before 1892 the same work was in the hands of other committees of
those bodies, and in 1897, a "Childhood Society" was formed to carry it on. Two
series of investigations have been made under these committees, the first from
1888-91, the second from 1892-94. As I understand, the whole of the observations
in the first period, and the great majority of them in the second, have been made by
Dr. Francis Warner.

For the complete description of the method of observation, &c., I must refer to the
report itself. A very large number of schools were visited (Board Schools, Poor
Law Schools, Voluntary Schools, &c., most, but not all, in London), and the children in
them examined individually for the presence or absence of certain defects, of which
the main classes were (using Dr. Warner's notation):†

A. Defects in development of the body or its parts; in size, form, or proportioning
   of parts.

B. Abnormal nerve signs; certain abnormal actions, movements, and balances.

C. Low nutrition, as indicated by the child being thin, pale, or delicate.

D. Mental dulness. The teachers' report as to mental ability was added to the
   record of each child registered, and those stated to be below the average in
   ability for school work were registered as "Dull."

These main classes of defects observed were the same in both investigations, in 
each of which 50,000 children were observed. The returns given are, however, 
somewhat more detailed for the later investigation, the material being sub-divided, 
for instance, into school standards. I have used material from both. 

§ 52. The whole of this mass of observations was made, as stated, on children not

* Published by the Committee, Parkes Museum, Margaret Street. Price 2s. M.

† Report, pp. 12-13. A fuller description of the signs is given on pp. 13-16 and pp. 72 et seq.
older than fourteen years or so. For purposes of comparison, and in order to be able
to follow the association of defects from childhood to old age, I have made some
use of the material available in the Census.* This consists of the numbers of those
who are blind, deaf and dumb, or mentally deranged, and the numbers of those
suffering from combinations of these defects; the sexes are separated and the
numbers in different age groups are given. The figures are very unreliable in the
early age groups (0-5, 5-10, 10-15), especially for "mental derangement,"† but
so far as I know they are the only figures of the kind published. I have not used
the age group 0-5 at all, and the others appear to give much the sort of
associations one would expect, quite comparable with those given by the results of
Dr. Warner's investigations.

§ 53. The case to be treated is by no means a simple one. Suppose a certain 
group of young children to be observed at some time and the frequencies of all com- 
binations of certain defects noted ; let the survivors be again observed after a lapse 
of years, and the defects be again noted. Changes in the relative frequencies and 
the associations observed between defects will have taken place for three reasons— 

(1) Because some of those originally observed have died ;

(2) Because some have outgrown or lost certain defects ; 

(3) Because others have acquired defects. 

We have taken the case as referring to children, but the first and third causes of 
change are equally, or more, effective in the case of adults. 

Now if the observations were made as supposed, and a record kept of the same
individuals, the effect of each change could be distinguished. Those who had either
lost or acquired defects‡ during the intervening period could be struck out of both
series; the resulting changes would be due to selection only, and so on. But
unfortunately this is not the case in any published statistics of which I am aware, even
in the "Childhood Committee's" work, where such a procedure might well have
been adopted. All that is given is that a certain group, closely centered round a

* Census of 1891, vol. 3, Tables 15, 16, p. lvii.; 1881, vol. 3, Tables 14, 15, p. xlv. In the Census of
1871 the numbers with combinations of defects are not given, so the material is not available for the
present purpose. In the Census of 1881 those who are "Idiot or Imbecile" are distinguished from the
"Insane," both in the first and second order groups, and I have made some use of this (cf. below, p. 312).
In 1891 those "Mentally Deranged from Childhood" and "Blind from Childhood" are distinguished from
others, but the same distinction is not continued where these defects are combined with others. The
notation of the Census is not clear: by the courtesy of the Registrar-General I am informed that
"Blind" includes "Blind and Dumb," and so on; but "Blind and Dumb" does not include "Blind,
Dumb, and Mentally Deranged," i.e., the frequencies given are (A) (B) (C) (ABγ) (AβC) (αBC) (ABC).
This should be made clear in a future Census.

† Cf. remarks in the Census Reports for 1881: General Report, p. 68.

‡ I speak of "defects" as they form our subject matter, but the reasoning applies to attributes of any
kind. I would define a defect, tentatively, as an attribute the possessors of which have a death-rate above
normal, but I do not know that this applies to all, e.g., of the defects noted by Dr. Warner.
certain age, has certain percentages of defectives and exhibits certain associations
between defects. Another group, of another mean age, has different percentages of
defects and different associations. Such differences are due to all three of the above
causes of change acting together, superposed in general on wholly unknown initial
differences between the groups, due either to their being differently gathered
samples, or to a secular change taking place in the population. It becomes, then,
impossible to separate, except conjecturally and in a more or less tentative manner,
the effects of selection from those of growth, including under this term all processes
of change that take place in the individual.

§ 54. Let (AB)₁, (Aβ)₁, (αB)₁, (αβ)₁ be the four second-order frequencies observed for
any pair of defects for an early age, and let (AB)₂, (Aβ)₂, (αB)₂, (αβ)₂ be the frequencies
at a later age in the same individuals, or in an older age group of the same population.
Then if Q₁, Q₂ be the association coefficients for the two groups, κ₁, κ₂ the values of
the corresponding ratio of the cross products,

$$Q_2 < Q_1 \quad \text{if} \quad \kappa_2 > \kappa_1,$$

or

$$\frac{(A\beta)_2 \, (\alpha B)_2}{(AB)_2 \, (\alpha\beta)_2} > \frac{(A\beta)_1 \, (\alpha B)_1}{(AB)_1 \, (\alpha\beta)_1}.$$

Let

$$\frac{(AB)_2}{(AB)_1} = s_1, \quad \frac{(A\beta)_2}{(A\beta)_1} = s_2, \quad \frac{(\alpha B)_2}{(\alpha B)_1} = s_3, \quad \frac{(\alpha\beta)_2}{(\alpha\beta)_1} = s_4.$$

Then

$$Q_2 < Q_1$$

so long as

$$s_1 s_4 < s_2 s_3.$$

If we are dealing with a population subjected solely to selection, s₁, s₂, s₃, s₄ are the
survival rates for the four classes; the quantities s₁/s₄, s₂/s₄, s₃/s₄ we may call the
"survival figures" for the three classes, and the condition may be written

$$\frac{s_1}{s_4} < \frac{s_2}{s_4} \cdot \frac{s_3}{s_4}.$$

That is, if the survival figure for the class with two defects be less than the product
of the survival figures for the singly defective classes, the association will decrease.
A priori I could not say whether this condition should hold or not; it appears
possible that selection might either decrease or increase association. Practically the
condition does seem to hold,* but, as mentioned above, the evidence is not complete
nor certain, for we cannot, amongst present data, find a single case in which change
is certainly due to selection alone without other causes.
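
The condition can be illustrated numerically. In the sketch below every figure is hypothetical (the initial frequencies and both sets of survival rates are invented for illustration and are not taken from Dr. Warner's or the Census data); only the criterion s₁s₄ < s₂s₃ comes from the argument above.

```python
def yule_q(ab, a_beta, alpha_b, alpha_beta):
    """Association coefficient Q from the four second-order frequencies."""
    same = ab * alpha_beta
    cross = a_beta * alpha_b
    return (same - cross) / (same + cross)


def after_selection(frequencies, survival_rates):
    """Apply class-wise survival rates (s1, s2, s3, s4) to the frequencies
    ((AB), (A beta), (alpha B), (alpha beta))."""
    return tuple(f * s for f, s in zip(frequencies, survival_rates))


def association_decreases(survival_rates):
    """The criterion of the text: Q falls so long as s1 * s4 < s2 * s3."""
    s1, s2, s3, s4 = survival_rates
    return s1 * s4 < s2 * s3


# Hypothetical early-age frequencies: (AB), (A beta), (alpha B), (alpha beta).
INITIAL = (50.0, 100.0, 100.0, 750.0)

# Hypothetical survival rates. In the first set the doubly defective class
# fares worse than the product rule allows; in the second it fares better.
HEAVY_LOSS_OF_DOUBLY_DEFECTIVE = (0.50, 0.80, 0.80, 0.95)
LIGHT_LOSS_OF_DOUBLY_DEFECTIVE = (0.90, 0.60, 0.60, 0.95)

Q_INITIAL = yule_q(*INITIAL)
Q_HEAVY = yule_q(*after_selection(INITIAL, HEAVY_LOSS_OF_DOUBLY_DEFECTIVE))
Q_LIGHT = yule_q(*after_selection(INITIAL, LIGHT_LOSS_OF_DOUBLY_DEFECTIVE))

if __name__ == "__main__":
    print(f"initial Q = {Q_INITIAL:.3f}")
    print(f"heavy loss of doubly defective: Q = {Q_HEAVY:.3f}")
    print(f"light loss of doubly defective: Q = {Q_LIGHT:.3f}")
```

The two cases show that selection alone may move the association in either direction, according as the survival figure of the doubly defective class falls short of, or exceeds, the product of the survival figures of the singly defective classes.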

When we are comparing one age group with another of the same population the 

* Below p. 312-315. 
s's are no longer, of course, simple survival rates, and may be greater than unity.
This is notably the case for instance with such defects as increase rapidly in old age,
e.g., blindness or mental derangement.

§ 55. Let us turn now to the consideration of Dr. Warner's materials. In Table I.
below I give the associations observed between the six possible pairs of defects in
the two successive investigations, together with the probable errors of the
association coefficients. At the bottom of the table are given the percentages of the
children observed with given defects A, B, C, or D, and with any defect or combination
of defects.* The associations observed are on the whole very markedly lower in
the second investigation, and the percentages of children with given defects are
smaller (except for dulness), but owing to the lower association the total percentage
of defective children has risen, in the case of the girls, from 14.3 to 14.8 per cent.

Table I. Comparison of the Association Coefficients in the Investigations of 1888-91
and 1892-94. A. Development Defects. B. Nerve Signs. C. Low Nutrition.
D. Mental Dulness.

                                     Boys.                          Girls.
                           1888-91.       1892-94.        1888-91.       1892-94.

AB . . . . . . . . . .   .898 ± .003    .750 ± .005     .904 ± .003    .784 ± .008
AC . . . . . . . . . .   .903 ± .004    .848 ± .007     .952 ± .002    .916 ± .004
AD . . . . . . . . . .   .893 ± .003    .846 ± .00S     .929 ± .00S    .900 ± .004
BC . . . . . . . . . .   .862 ± .006    .783 ± .010     .914 ± .004    .814 ± .009
BD . . . . . . . . . .   .893 ± .003    .897 ± .003     .926 ± .003    .905 ± .004
CD . . . . . . . . . .   .791 ± .009    .823 ± .008     .863 ± .006    .835 ± .008

Percentage of children with defect
   A . . . . . . . . .      13.5            9.6             9.6            6.8
   B . . . . . . . . .      12.6           11.9             9.0            8.5
   C . . . . . . . . .       3.8            3.1             4.4            3.2
   D . . . . . . . . .       8.2            8.7             6.3            6.9

With any defect  . . .      19.9           18.2            14.3           14.8




The associations are, however, all high (very high compared with most coefficients
of organic correlation with which one has to deal), ranging from .784 to .952.

* I.e., the last figure is 100 (1 - (αβγδ)/N).
§ 56. The differences between the associations cannot probably be, at present at all
events, assigned to any definite cause or any definite difference in the material
observed. We would expect to find differences of association in different groups, just
as we find differences between the correlation coefficient for different local races,
without our being able to say with certainty how these differences have come about.
In many cases the differences are certainly real, as they are large compared with the
probable errors of the differences: three to eight times the probable errors. That
considerable differences exist between different classes of schools, not only in the
proportions of defective children but in the associations between defects, is shown by
Table II. (based on Table 19 of the Report), and consequently the divergence between
the results of the two investigations may be due to the different classes of schools
observed. Such differences between schools may in their turn be partly or wholly
due to differences of age or nationality between the children. The effects of age we
deal with below (pp. 309, et seq.); the differences between nationalities* are illustrated
by Table III. (based on Tables 27, 28 of the Report), showing the associations for
English, Jews, and Irish:

§ 57. These two tables suggested to me at first sight an apparent law — that asso- 
ciations were on the whole higher where populations were healthier or less defective. 
If we take Table II., 4 of the 6 associations in Poor Law schools are greater than 
those in Industrial schools ; 4 of the associations in Homes and Orphanages are 
greater than those of the Poor Law schools, equality subsisting in one of the remain- 
ing cases ; and 5 of the associations in Elementary schools are greater than those in 
Homes and Orphanages, taking the schools in order of healthiness. The case is not so 
marked for girls, and we must note that for them the Homes and Orphanages are 
more defective than Poor Law schools. In Table III. for the boys, Jews are more 
defective than English, and Irish than Jews ; the Jews are less associated than the 
English in 4 cases of the 6, and the Irish less associated than the Jews in 4 cases out 
of 6 also. But again the case breaks down for girls. English girls are more 
defective than Jewish girls, but their associations are less in just 3 out of the 6 cases; 
Irish girls are more defective than English, but their associations are actually greater 
in the majority of cases. 

If we compare in this way only those cases that are adjacent in the order of 
defectiveness we get 

                                                          Boys.           Girls.

Table II.
  Associations greater where defectiveness less        in 12 cases,    in 9 cases.
  Associations less or equal where defectiveness less  in  6 cases,    in 9 cases.

Table III.
  Associations greater where defectiveness less        in  8 cases,    in 5 cases.
  Associations less or equal where defectiveness less  in  4 cases,    in 7 cases.

* It must be noted that they were all London children.

Table III. Illustrating the Variation in Associations between Defects in different
Nationalities (1892-94).

A. Development Defects.  B. Nerve Signs.  C. Low Nutrition.  D. Dulness.

Association.               English.         Jews.           Irish.

AB            Boys       .752 ± .008     .734 ± .023     .729 ± .023
              Girls      .768 ± .009     .835 ± .018     .860 ± .017

AC            Boys       .862 ± .008     .832 ± .025     .750 ± .037
              Girls      .916 ± .005     .914 ± .015     .903 ± .016

AD            Boys       .849 ± .006     .863 ± .014     .785 ± .022
              Girls      .894 ± .005     .910 ± .001     .929 ± .010

BC            Boys       .787 ± .011     .796 ± .029     .735 ± .047
              Girls      .808 ± .010     .822 ± .029     .863 ± .021

BD            Boys       .904 ± .004     .847 ± .015     .894 ± .017
              Girls      .906 ± .004     .881 ± .013     .914 ± .011

CD            Boys       .843 ± .009     .710 ± .042     .770 ± .035
              Girls      .840 ± .009     .771 ± .038     .839 ± .026

Per cent. A   Boys           8.4             9.2            11.6
              Girls          6.8             6.3             8.4

Per cent. B   Boys          10.2            11.9            14.8
              Girls          8.4             7.9             9.0

Per cent. C   Boys           2.8             3.0              ?
              Girls          3.4             2.4              ?

Per cent. D   Boys           8.0             8.1             8.4
              Girls          7.0             6.6             6.5

Number        Boys        20,682           2,631           2,171
observed      Girls       18,286           2,668           1,952


Thus the statistics of the girls do not at all support the first impression given by 
the figures for boys, and the whole of Table I. is directly adverse to any such
hypothesis. In 10 cases out of the 12 of that table the associations are smaller in the 
second investigation, but in 6 cases out of 8 the proportions of defects are also 
smaller. Finally, it must be remembered that the theorem given on p. 288 in the 
section on Probable Errors does not lead us to expect a priori any correlation between 
degree of association and degree of defectiveness, and we must therefore demand 
pretty clear proof of the existence of such an empirical relation. The facts shown 
below that women are at once less defective and more highly associated than men, 
and that partial coefficients of association in undefective universes are higher than 
total coefficients, and much higher than partial coefficients in defective universes 
(Tables IV. and V., pp. 306-307), may be said to bring some support to such a
hypothesis, but other explanations are here possible. The fact that the association
decreases throughout life, as far as we can judge from present material, while
defectiveness also decreases in later childhood (cf. below, p. 310) is against it.
Thus I do not think we can accept the hypothesis without wider evidence ; I have 
mentioned it as it occurred to me, and would probably occur to others, as covering 
certain of the facts presented. 

§ 58. The foregoing figures of Tables I.-III. show that all the defects dealt with by
Dr. Warner are associated to a high degree, though this degree varies somewhat in
different groups of material. The question now arises, can we investigate further
the nature of the association between A and B or B and C? Suppose the hypothesis
to be put forward, for example, that low nutrition was the cause of both defects in 
development and nerve signs, and that we only found the latter occurring together 
because they were both generally present in cases of low nutrition ; could this 
hypothesis be tested? It could be proved at once by forming the partial coefficient
|AB|γ|. If this were small the hypothesis would be confirmed, as we would be
shown that on excluding all cases of C, A and B ceased to be associated. If, on the
other hand, |AB|γ| were still large, even though slightly smaller than |AB|, the
hypothesis could only be partially true or be a partial explanation.
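
The test just described can be sketched numerically. The sketch below assumes that the coefficient of association takes the usual fourfold form Q = [(AB)(αβ) - (Aβ)(αB)] / [(AB)(αβ) + (Aβ)(αB)], and that the partial coefficient |AB|γ| is the same quantity computed within the universe of individuals without C. The function names and all frequencies are hypothetical and chosen only for illustration; they are not Dr. Warner's data.

```python
def yule_q(ab, a_beta, alpha_b, alpha_beta):
    """Coefficient of association for a fourfold table of frequencies."""
    concordant = ab * alpha_beta
    discordant = a_beta * alpha_b
    return (concordant - discordant) / (concordant + discordant)


def partial_q(counts, c_present):
    """Association of A and B inside the universe C (True) or gamma (False).

    counts maps (a, b, c) triples of booleans to frequencies.
    """
    return yule_q(
        counts[(True, True, c_present)],
        counts[(True, False, c_present)],
        counts[(False, True, c_present)],
        counts[(False, False, c_present)],
    )


def total_q(counts):
    """Association of A and B in the whole universe, C being disregarded."""
    def cell(a, b):
        return counts[(a, b, True)] + counts[(a, b, False)]

    return yule_q(cell(True, True), cell(True, False),
                  cell(False, True), cell(False, False))


# Hypothetical frequencies: A and B are independent within C and within gamma,
# but both are far commoner where C is present.
counts = {
    (True, True, True): 250, (True, False, True): 250,
    (False, True, True): 250, (False, False, True): 250,
    (True, True, False): 90, (True, False, False): 810,
    (False, True, False): 810, (False, False, False): 7290,
}

print("total   |AB|       =", round(total_q(counts), 3))
print("partial |AB|gamma| =", round(partial_q(counts, False), 3))
```

In this invented case the total association is sensibly positive while the partial association in the negative universe vanishes, which is the pattern that would confirm the hypothesis; a partial coefficient remaining large would leave it at best a partial explanation.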

The partial coefficients in undefective or negative universes thus play the same
sort of part in checking interpretations as partial coefficients of correlation. Partial
associations in positive or mixed universes give further information ; if, for instance,
|AB|C|, |AB|D|, |AB|CD|, &c., be all of the same order of magnitude as |AB|, it
is evident that the presence of B continues to be a bad symptom — to render A more 
likely — even when C, D, &c., are already present. If, on the other hand, these 
associations are small, or zero within the limits of probable error, the piling up of 
symptom on symptom, or defect on defect, ceases to make the case any worse. 

§ 59. Now in the present case we have four defects to handle. These give 6 total 
coefficients; 12 first-order coefficients with negative and 12 with positive universes ; 
6 second-order coefficients with wholly negative, 12 with partially positive, and 
6 with wholly positive universes — or 54 altogether (excluding what I have called 
"group coefficients"). I did not think it worth while to calculate all these, and
have confined myself to the total and second-order partial coefficients. These are 
given in Table IV. for boys and Table V. for girls. As well as working out the 
associations for children of all ages, I have divided each sex into three groups : 
Infants, Standards I. to III., and Standards IV. to Extra VII. (material in Report,
Table 21). This serves for two purposes : first, to check signs, &c., as given in the
"all ages" column ; secondly, to give some idea of change of association with age, a
purpose for which no other material is available in the Report.*
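
The count of 54 coefficients can be checked by direct enumeration. The following is a minimal sketch, with hypothetical function names: for each pair of defects it lists every universe formed by specifying none, one, or both of the remaining two defects as present or absent, and then tallies the results by order and by the number of defects specified as present.

```python
from collections import Counter
from itertools import combinations, product


def enumerate_coefficients(attributes="ABCD"):
    """List (pair, universe) for every total and partial coefficient.

    A universe is a tuple of (attribute, present) items; the empty tuple
    stands for the total coefficient.
    """
    coefficients = []
    for pair in combinations(attributes, 2):
        others = [x for x in attributes if x not in pair]
        for order in range(len(others) + 1):
            for fixed in combinations(others, order):
                for present in product((False, True), repeat=order):
                    coefficients.append((pair, tuple(zip(fixed, present))))
    return coefficients


def tally(coefficients):
    """Count coefficients by (order, number of positive specifications)."""
    return Counter(
        (len(universe), sum(present for _, present in universe))
        for _, universe in coefficients
    )


coefficients = enumerate_coefficients()
by_class = tally(coefficients)
print("all coefficients:", len(coefficients))
for (order, positives), n in sorted(by_class.items()):
    print(f"order {order}, {positives} positive specification(s): {n}")
```

The classes come out as 6 total coefficients, 12 and 12 of the first order, and 6, 12 and 6 of the second order, agreeing with the enumeration in the text.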

* A table is given (Table 22 of the Report) showing the frequency of defective groups, but the number
of undefective children (αβγδ) is not given, nor the number of children observed at each age. This makes
the figures useless for discussing the associations of defects in normal children. The importance of the

§ 60. Comparing first the partial coefficients with negative universes, |AB|γδ|,
&c., with the total coefficients, we see that in every case, without exception, the
partial coefficient is greater than the total. Hence it cannot be true that any one of
the defects noted is, or is even indicative of, a necessary connecting link between any
other two. That low nutrition brings on at once development defects and nerve
signs, or development defects and dulness, so that we find these pairs associated is,
for instance, a hypothesis that may be partly true but is insufficient to explain the
facts observed. Dr. Warner's hypothesis that "the connecting link between defects
of body and defective mental action is the coincident defect of brain which may be
known by observation of 'abnormal nerve-signs'"† seems to me equally untenable ;
it may be so in some cases, but on the other hand the "connecting link" may be a
defect of brain not indicated by abnormal nerve signs, or not a defect of brain at all.
The demonstration of a necessary connecting link X between A and B would, it
seems to me, only be complete when |AB|ξ| was shown to be small compared with
|AB| ; the demonstration that either X₁ or X₂ or X₃ ... or Xₙ had to be present as a
connecting link would only be complete if |AB|ξ₁ξ₂ξ₃ ... ξₙ| were shown to be
small (zero within the limits of error). Now |AB|γδ|, |AC|βδ|, &c., are not small
but even larger than |AB|, |AC|, &c., hence CD, BD, &c., cannot be necessary even
as alternative connecting links or symptoms of such links in the way described
above ; the case must be much more complex and depend on a much greater variety
of conditions than those described by the four classes of defects noted. The following
figures show further how little the absence of nerve signs affects the chance of
an individual with development defects being mentally dull ; on Dr. Warner's
hypothesis (AβD)/(Aβ) should be small compared with (AD)/(A) :



                                                                  Boys.    Girls.

Chance of individual being dull who exhibits development
    defects = (AD)/(A)                                            .385     .449
Ditto, but no nerve signs = (AβD)/(Aβ)                            .341     .411
Ditto, but neither nerve signs nor low nutrition
    = (AβγD)/(Aβγ)                                                .329     .414




omission is apparently unrecognised, as the last Report issued (1899) only gives more such figures; this
seems to me waste of time and money.
† Report, p. 13.

Table IV. Showing the Associations between Defects for Boys, for different Groups
of Standards, and for all Ages together. (1892-94 investigation.)

A. Development Defects.  B. Nerve Signs.  C. Low Nutrition.  D. Mental Dulness.

Coefficient of                          Standards        Standards
association.          Infants.          I.-III.          IV.-Ex. VII.     All ages.*

|AB|               +.794 ± .014     +.743 ± .010     +.722 ± .016     +.750 ± .005
|AC|               +.900 ± .009     +.816 ± .013     +.802 ± .025     +.848 ± .007
|AD|               +.880 ± .008     +.834 ± .007     +.822 ± .013     +.846 ± .005
|BC|               +.862 ± .012     +.749 ± .016     +.845 ± .019     +.783 ± .010
|BD|               +.928 ± .005     +.890 ± .005     +.886 ± .008     +.897 ± .003
|CD|               +.868 ± .011     +.792 ± .014     +.748 ± .034     +.823 ± .008

|AB|γδ|            +.859 ± .016     +.827 ± .010     +.790 ± .016     +.826 ± .007
|AC|βδ|            +.962 ± .006     +.928 ± .009     +.928 ± .016     +.942 ± .005
|AD|βγ|            +.947 ± .006     +.937 ± .005     +.933 ± .008     +.939 ± .003
|BC|αδ|            +.948 ± .010     +.896 ± .013     +.935 ± .015     +.912 ± .008
|BD|αγ|            +.973 ± .003     +.951 ± .003     +.946 ± .005     +.955 ± .002
|CD|αβ|            +.964 ± .007     +.935 ± .011     +.923 ± .026     +.949 ± .006

|AB|γD|            -.429 ± .060     -.421 ± .038     -.532 ± .068     -.443 ± .027
|AB|Cδ|            -.227 ± .112     -.335 ± .089     -.486 ± .126     -.348 ± .059
|AC|βD|            +.013 ± .101     +.125 ± .088     +.025 ± .211     +.096 ± .060
|AC|Bδ|            +.418 ± .090     +.116 ± .079     +.043 ± .124     +.210 ± .053
|AD|βC|            -.149 ± .111     +.193 ± .101     +.062 ± .233     +.076 ± .070
|AD|Bγ|            +.055 ± .079     +.088 ± .039     +.018 ± .065     +.079 ± .030
|BC|αD|            -.066 ± .102     -.227 ± .083     -.034 ± .187     -.201 ± .057
|BC|Aδ|            +.286 ± .087     -.077 ± .082     +.098 ± .133     -.002 ± .054
|BD|αC|            +.251 ± .118     +.159 ± .102     +.062 ± .210     +.140 ± .070
|BD|Aγ|            +.370 ± .066     +.211 ± .043     +.132 ± .069     +.226 ± .031
|CD|αB|            +.116 ± .100     +.011 ± .071     -.126 ± .126     +.077 ± .050
|CD|Aβ|            +.043 ± .085     +.174 ± .073     -.012 ± .165     +.160 ± .049

|AB|CD|            -.223 ± .120     -.211 ± .105     -.263 ± .245     -.233 ± .072
|AC|BD|            +.239 ± .100     +.345 ± .069     +.335 ± .155     +.323 ± .051
|AD|BC|            -.137 ± .128     +.318 ± .094     +.312 ± .184     +.197 ± .044
|BC|AD|            +.164 ± .103     +.003 ± .080     +.281 ± .181     +.035 ± .058
|BD|AC|            +.255 ± .111     +.286 ± .097     +.312 ± .206     +.254 ± .067
|CD|AB|            -.088 ± .113     +.251 ± .081     +.176 ± .167     +.197 ± .058

Per cent. with A        7.9              9.9              7.4              8.8
Per cent. with B        6.3             13.5              8.3             10.9
Per cent. with C        3.7              3.1              1.5              2.8
Per cent. with D        6.7              9.7              5.4              7.9
Per cent. with
  any defect           14.0             21.2             16.3             18.2

Number observed       7,055           11,482            7,168           26,287

* "All ages" includes those in Standard and in no Standard not included in the other columns.

Table V. Showing the Associations between Defects for Girls, for different Groups
of Standards, and for all Ages together. (1892-94 investigation.)

A. Development Defects.  B. Nerve Signs.  C. Low Nutrition.  D. Mental Dulness.

Coefficient of                          Standards        Standards
association.          Infants.          I.-III.          IV.-Ex. VII.     All ages.*

|AB|               +.850 ± .013     +.795 ± .010     +.747 ± .021     +.784 ± .008
|AC|               +.949 ± .005     +.903 ± .007     +.881 ± .015     +.916 ± .004
|AD|               +.927 ± .006     +.886 ± .006     +.898 ± .010     +.900 ± .00?
|BC|               +.866 ± .013     +.821 ± .013     +.821 ± .021     +.814 ± .009
|BD|               +.935 ± .006     +.908 ± .005     +.880 ± .010     +.905 ± .004
|CD|               +.876 ± .012     +.833 ± .011     +.746 ± .034     +.835 ± .008

|AB|γδ|            +.896 ± .016     +.881 ± .009     +.783 ± .026     +.850 ± .009
|AC|βδ|            +.970 ± .004     +.974 ± .004     +.955 ± .009     +.971 ± .002
|AD|βγ|            +.960 ± .005     +.957 ± .004     +.963 ± .005     +.959 ± .003
|BC|αδ|            +.957 ± .009     +.940 ± .008     +.897 ± .011     +.927 ± .007
|BD|αγ|            +.980 ± .003     +.956 ± .003     +.932 ± .008     +.955 ± .002
|CD|αβ|            +.940 ± .011     +.941 ± .010     +.936 ± .016     +.941 ± .007

|AB|γD|            -.381 ± .077     -.363 ± .045     -.437 ± .076     -.394 ± .033
|AB|Cδ|            -.200 ± .123     -.432 ± .079     -.175 ± .140     -.352 ± .057
|AC|βD|            +.445 ± .099     +.394 ± .076     -.172 ± .180     +.325 ± .056
|AC|Bδ|            +.422 ± .10?     +.315 ± .071     +.571 ± .081     +.445 ± .045
|AD|βC|            +.308 ± .117     +.155 ± .102     -.070 ± .199     +.170 ± .068
|AD|Bγ|            +.090 ± .101     +.147 ± .048     +.433 ± .072     +.257 ± .035
|BC|αD|            -.061 ± .154     +.006 ± .090     -.347 ± .149     -.108 ± .065
|BC|Aδ|            +.246 ± .097     -.106 ± .079     +.223 ± .126     +.011 ± .055
|BD|αC|            +.320 ± .139     +.156 ± .105     -.144 ± .178     +.143 ± .077
|BD|Aγ|            +.418 ± .076     +.127 ± .055     +.152 ± .097     +.211 ± .039
|CD|αB|            -.222 ± .135     +.015 ± .071     -.117 ± .120     +.012 ± .055
|CD|Aβ|            +.115 ± .077     -.007 ± .072     -.339 ± .149     -.019 ± .049

|AB|CD|            -.275 ± .185     -.376 ± .092     -.091 ± .268     -.296 ± .072
|AC|BD|            +.535 ± .101     +.381 ± .067     +.200 ± .209     +.421 ± .051
|AD|BC|            +.234 ± .155     +.219 ± .105     +.016 ± .232     +.230 ± .071
|BC|AD|            +.058 ± .107     -.010 ± .079     +.015 ± .230     +.006 ± .058
|BD|AC|            +.247 ± .110     +.220 ± .092     +.016 ± .232     +.230 ± .071
|CD|AB|            -.078 ± .126     +.089 ± .085     -.511 ± .152     -.027 ± .063

Per cent. with A        7.8              7.3              4.2              6.8
Per cent. with B        4.2             10.3              8.8              8.5
Per cent. with C        3.9              3.4              2.0              3.2
Per cent. with D        5.3              8.3              4.5              6.9
Per cent. with
  any defect           11.8             16.6             13.2             14.8

Number observed       6,274           11,090            6,026           23,713

* "All ages" includes those in Standard and in no Standard not included in the other columns.

§ 61. Turning next to the partial coefficients with wholly or partly defective
universes (i.e., associations in groups of which every member possesses either one or
two defects in addition to the possible two of which the associations are considered),
we see that the great majority are small and many even negative. The probable
errors are, however, high, since the material is small when we are confined to those
who are defective, so one cannot always lay great stress on the sign. In three cases
at least, however, if not in four, the negative sign appears to be certainly significant ;
I refer to the partial associations of A and B, |AB|γD|, |AB|Cδ|, and |AB|CD|, in
which cases all the twenty-four coefficients for both sexes are negative ; and the one
partial coefficient |BC|αD|, seven out of the eight examples of which are negative,
the one positive value (Girls, Standards I.-III.) being only 1/15th of its probable error.
The first case is the most general and remarkable, but in both it will be noted we
are dealing with an association of nerve signs. As might be conjectured from the
generality of the sign, |AB|C| and |AB|D| are also negative, i.e., when individuals
exhibit either low nutrition or mental dulness, or both, the presence of nerve signs
lessens the probability of development defects being present and vice versâ.

This case is most remarkable, and the following figures, showing the chance of an 
individual exhibiting development defects (or nerve signs) when he exhibits nerve 
signs (or development defects), and so on, illustrate it further. Multiplied by 100 
the chances can, of course, be read as percentages, i.e., 50 per cent, of C's are A, but 
only 42 per cent, of BC's are A (for the boys), and so on. 



A. Development Defects.  B. Nerve Signs.  C. Low Nutrition.  D. Dulness.

Chance of individual being A.          Chance of individual being B.

Who is      Boys.     Girls.           Who is      Boys.     Girls.

B           .311      .291             A           .384      .363
C           .499      .556             C           .471      .435
D           .428      .445             D           .576      .526
BC          .423      .466             AC          .398      .364
BD          .337      .352             AD          .454      .417
CD          .529      .606             CD          .523      .478
BCD         .473      .530             ACD         .468      .418


In every case the presence of B is antagonistic to that of A when either C or D 
is present ; and A is similarly antagonistic to B. I am quite unable to suggest any 
possible explanation of this, but so unexpected a result ought to throw some light on 
the physiological relations between the signs observed. It is particularly curious to 
note that A and B are negatively associated in the presence of either of two defects 
so apparently different as mental dulness and low nutrition. The point seems to be 
worth further investigation by the committee.*
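
The antagonism can be read off mechanically from the chances tabulated above. The sketch below, with hypothetical names, copies the relevant entries and computes, for each sex, the change in the chance of A (or of B) when B (or A) is added to a group already exhibiting C, D, or both.

```python
# CHANCE[sex][(target, given)] is the tabulated chance that an individual
# exhibiting the defects in `given` also exhibits `target`.
CHANCE = {
    "boys": {
        ("A", "C"): 0.499, ("A", "BC"): 0.423,
        ("A", "D"): 0.428, ("A", "BD"): 0.337,
        ("A", "CD"): 0.529, ("A", "BCD"): 0.473,
        ("B", "C"): 0.471, ("B", "AC"): 0.398,
        ("B", "D"): 0.576, ("B", "AD"): 0.454,
        ("B", "CD"): 0.523, ("B", "ACD"): 0.468,
    },
    "girls": {
        ("A", "C"): 0.556, ("A", "BC"): 0.466,
        ("A", "D"): 0.445, ("A", "BD"): 0.352,
        ("A", "CD"): 0.606, ("A", "BCD"): 0.530,
        ("B", "C"): 0.435, ("B", "AC"): 0.364,
        ("B", "D"): 0.526, ("B", "AD"): 0.417,
        ("B", "CD"): 0.478, ("B", "ACD"): 0.418,
    },
}


def change_on_adding(sex, target, extra, base):
    """Change in the chance of `target` when `extra` joins the defects in `base`."""
    with_extra = "".join(sorted(base + extra))
    return CHANCE[sex][(target, with_extra)] - CHANCE[sex][(target, base)]


changes = {
    (sex, target, base): change_on_adding(sex, target, extra, base)
    for sex in CHANCE
    for target, extra in (("A", "B"), ("B", "A"))
    for base in ("C", "D", "CD")
}

for (sex, target, base), delta in changes.items():
    print(f"{sex}: chance of {target} within {base} changes by {delta:+.3f}")
```

Every one of the twelve differences is negative, which is the sense in which B is antagonistic to A, and A to B, when C or D is present.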

* Note 4/4/00. Dr. E. B. Shuldham writes to me as follows : "My experience with boys at the Bisley
farm schools was that many of the boys who had left poor homes in London with insufficient feeding suffered
from suppuration of the cervical glands a few weeks after their removal to Bisley, also to a great change for
the better in food, clothing, and shelter. After a prolonged residence at Bisley the glandular enlargements
lessened, and the suppuration ceased." This is interesting for comparison with the above, as we have here
an apparently negative association between "suppuration" and "low nutrition" in the case of boys of
poor physique. Dr. Shuldham also states that the late Mr. H. Jones, for many years Superintendent of
the boys at Bisley, noticed the fact that those boys who did not take green vegetables with their food
suffered from night-blindness, which ceased when he insisted on their taking green food regularly for some
weeks. This suggests that further inquiry as to nutrition and eyesight might be of a good deal of interest.

Change of Association with Age.

§ 62. I gather from Table XXV. of the Report that the average age of the
"Infants" would be three or four years, Standards I.-III. about seven years, and
Standards IV.-Ex. VII. eleven or twelve. It is unsatisfactory not having a clear
classification of the children by age pure and simple, as a classification of Standards
must imply an uncertain amount of selection by mental capacity.

It is a curious point that Standards I.-III. exhibit a higher percentage of defects,
with the exception of cases of low nutrition, than either the "Infants" or Standards
IV.-Ex. VII. The percentages are given in full for each separate Standard in
Table VI. below, and in every defect there is a rise in frequency on passing from the
group of "Infants" to Standard I. ; in most cases a considerable rise, the percentages
of nerve signs more than doubling, and the percentages of development defects for
boys increasing by half ; the percentage of dull also increases largely. One can only
conclude that these defects, whether inherited or not, are not "innate" in the simple
sense of being observable at birth or during early infancy ; possibly they may be
brought on in part by the school attendance itself, a theory which would account
for the sudden rise after "Infancy," or rather early childhood. As for the subsequent
decreases in the percentages of all defects, I am inclined to attribute them, in part
at all events, to the selection by capacity that must take place. This would reduce
the percentage of "mentally dull," and, owing to the association between defects,
would also reduce indirectly the percentages of those with other defects. A proportion
of the decrease one would expect to take place from natural selection, but probably
a small proportion. We have, in fact, selective mortality, selection by mental
capacity, and growth (change in the individual), all acting together as causes of
change (and very probably also initial differences between the groups from which
the older and younger children sprang), precisely as was indicated in the previous
discussion on pp. 299-300.

Table VI. Showing the percentages of Children with given Defects and with any
Defect in the different Standards. (1892-94 investigation.)

A. Development Defects.  B. Nerve Signs.  C. Low Nutrition.  D. Mental Dulness.

                                   Boys.                                 Girls.
                                              With any                              With any
Standard.     Age.     A      B      C      D     defect.     A      B      C      D     defect.

Infants       0-5     7.9    6.3    3.7    6.7    14.0       7.8    4.2    3.9    5.3    11.8
I.             6     11.5   13.9    4.1   11.0    22.9       8.4   10.6    4.7    9.5    17.2
II.            7      9.7   14.1    2.9    9.2    21.2       7.2   10.3    3.2    8.1    17.1
III.           8      8.2   12.4    2.0    8.6    19.3       6.0   10.0    2.2    7.1    15.3
IV.            9      7.5   11.8    1.6    7.0    17.9       4.5    9.3    2.1    5.6    13.7
V.            10      7.7    9.7    1.2    4.9    16.2       4.7    7.7    1.8    2.1    12.7
VI.           11      7.4    9.0    1.7    4.3    15.6       3.0    9.9    2.2    3.5    13.6
VII.          12      6.5    7.0    1.2    2.8    12.4       2.5    8.6    2.3    1.6    11.9
Ex. VII.      13      5.5    6.9    2.1    2.8    11.8       3.8    6.9    1.5    3.8    10.7

All ages              8.8   10.9    2.8    7.9    18.17      6.8    8.5    3.2    6.9    14.78

"All ages" includes those in Standard and in no Standard.

§ 63. In the association coefficients there is, however, no irregularity in the group
Standards I.-III. ; the most cursory inspection of Tables IV. and V. shows that the
total coefficients and partial coefficients with negative universes decrease when we
pass from "Infants" to the next group, and again from Standards I.-III. to the next
group. The changes in partial associations with positive or defective universes are
not so obvious, partly, no doubt, because these coefficients are small and have large
probable errors, but the majority of the changes are in the same direction. Table VII.
shows the total number of changes in either direction. Thus comparing Standards
I.-III. with "Infants," we see that in the case of the boys all the total associations and all
the partials with negative universes have decreased ; of the partial associations in
defective universes 8 only decreased while 10 have increased. Taking all cases
together there are 88 decreases to 32 increases, but if we cut out the partials there are
22 decreases to 2 increases only. However we take it there is overwhelming
evidence that association in general decreases as age advances.
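
The totals quoted here can be recovered from the separate entries of Table VII. The short sketch below, with hypothetical names, records each entry as a pair (decreases, increases) and sums them by class of coefficient and over all classes.

```python
# (decreases, increases) in the order: boys Infants to I.-III., boys I.-III.
# to IV.-VII., girls Infants to I.-III., girls I.-III. to IV.-VII.
TABLE_VII = {
    "total coefficients": [(6, 0), (5, 1), (6, 0), (5, 1)],
    "universe with no other defect": [(6, 0), (5, 1), (4, 2), (5, 1)],
    "universe defective": [(8, 10), (14, 4), (13, 5), (11, 7)],
}


def totals(entries):
    """Sum a list of (decreases, increases) pairs."""
    decreases = sum(d for d, _ in entries)
    increases = sum(i for _, i in entries)
    return decreases, increases


def share_decreasing(decreases, increases):
    """Proportion of the changes that are decreases."""
    return decreases / (decreases + increases)


by_class = {name: totals(entries) for name, entries in TABLE_VII.items()}
overall = totals([pair for entries in TABLE_VII.values() for pair in entries])

for name, (dec, inc) in by_class.items():
    print(f"{name}: {dec} decreases, {inc} increases")
print("all cases:", overall, "share decreasing:", round(share_decreasing(*overall), 2))
```

The total coefficients alone give 22 decreases to 2 increases, and all classes together 88 to 32, as stated.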

§ 64. That this is a law of pretty general application seems to be borne out by the
entirely different class of statistics drawn from the English Census.* In Table VIII.
are given the associations derived from these figures between blindness and dumbness,
blindness and mental derangement, and dumbness and mental derangement. The
figures do not run quite regularly, but the associations for the age group 5-15 are,
in every case, greater than the "all-ages" associations, and if we draw curves showing
the change of association during life the downward trend is quite obvious (curves of
figs. 1, 2, 3, p. 314).

Such decrease is again not, of course, the result of selection alone but of selective
mortality, growth (change in the individual), and initial differences in the child-
populations from which the successive age groups were formed. That changes in the
individual will decrease association is obvious in the case of such a defect as blindness,
which may be brought on by accident, cataract, or other causes having no bearing on
mental derangement or dumbness. It is not obvious, however, why the association
between blindness and mental derangement should be much smaller than the
association between blindness and dumbness in the later age groups.

* Loc. cit., on p. 42.

§ 65. The question of what would be the effect of selection alone, or whether it
would have any regular effect in one direction, is, however, a most interesting one.

Table VII. Showing the Number of Cases in which the Association Coefficient
decreased or increased on passing from one group of Standards to the next.
There is a great majority of decreases.

                                  Boys.                           Girls.                 Total.

                       Infants to      Standards       Infants to      Standards
Class of               Standards       I.-III. to      Standards       I.-III. to
coefficient.           I.-III.         IV.-VII.        I.-III.         IV.-VII.

                       Decr.  Incr.    Decr.  Incr.    Decr.  Incr.    Decr.  Incr.    Decr.  Incr.

Total coefficients       6     -         5     1         6     -         5     1        22      2
Universe with no
  other defect           6     -         5     1         4     2         5     1        20      4
Universe defective       8    10        14     4        13     5        11     7        46     26

Total                   20    10        24     6        23     7        21     9        88     32

(Note. Two coefficients that remained constant from one Standard group to the next have been
entered as decreasing.)

The fact that partial associations in undefective universes are higher than total associa- 
tions, combined with my first impression (unjustified I think) that associations were, on 
the whole, higher in the healthier groups, led me at first to believe that the effect of 
selection would be to increase associations. I still cannot help thinking that this is 
practically, as it is formally, possible, it being remembered that association will 
decrease or increase simply according as

        s₁s₄ ≶ s₂s₃,

s₁, s₂, s₃, s₄ being the survival rates in the four classes AB, Aβ, &c. (vide § 54).
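
The condition on the survival rates can be illustrated numerically. The sketch below assumes the fourfold form Q = [(AB)(αβ) - (Aβ)(αB)] / [(AB)(αβ) + (Aβ)(αB)] for the coefficient of association, takes the classes in the order AB, Aβ, αB, αβ with survival rates s1, s2, s3, s4, and uses purely hypothetical frequencies and rates.

```python
def yule_q(ab, a_beta, alpha_b, alpha_beta):
    """Coefficient of association for a fourfold table of frequencies."""
    concordant = ab * alpha_beta
    discordant = a_beta * alpha_b
    return (concordant - discordant) / (concordant + discordant)


def select(cells, survival):
    """Frequencies left in each class after applying its survival rate."""
    return tuple(n * s for n, s in zip(cells, survival))


# Hypothetical frequencies of the classes AB, A-beta, alpha-B, alpha-beta.
cells = (40, 60, 60, 840)
before = yule_q(*cells)

# s1*s4 < s2*s3: the association is lowered by selection.
lowered = yule_q(*select(cells, (0.5, 0.9, 0.9, 0.9)))
# s1*s4 > s2*s3: the association is raised by selection.
raised = yule_q(*select(cells, (0.9, 0.5, 0.9, 0.9)))
# s1*s4 = s2*s3: the association is left unaltered.
unaltered = yule_q(*select(cells, (0.8, 0.8, 0.8, 0.8)))

print(round(lowered, 3), round(before, 3), round(raised, 3), round(unaltered, 3))
```

Since selection multiplies the ratio (AB)(αβ)/(Aβ)(αB) by s1·s4/(s2·s3), and Q rises and falls with that ratio, the direction of the change depends only on which of the two products of survival rates is the greater.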
§ 66. We have no material, as already stated, for testing the effect of pure
selection with any absolute certainty. The greater number of the deaf and dumb,
however, possess their defect from birth or early childhood, and the same statement
holds good for imbeciles or idiots as distinct from the insane. Hence the association
between deaf-mutism and imbecility will be affected by selection (selective
mortality) to a large extent, at all events, as compared with those associations that
were given in Table VIII. The necessary data for discussing the association in this
case are given in the English Census for 1881 (loc. cit., note on p. 298), and the
results are tabulated in Table IX. The figures for males show a steady and continuous
decrease in association, without a break ; for females there is, on the whole,
a decrease, but it is less regular.* Taking both sexes together, I think the decrease
in association is rather greater for dumbness and imbecility than for dumbness and
mental derangement. Thus Table IX. seems to point to selection causing decrease of
association. This is the only evidence we have of at all a direct character, and so its
results should be accepted pending the production of anything better. At the same
time the material is obviously not unimpeachable, and an endeavour should be made
to get reliable statistics for the special purpose.†

Differences between the Sexes. 

§ 67. The differences exhibited by the sexes as regards association are so marked
that they can hardly have failed to have struck the reader of the foregoing tables.
In an immense majority of cases the association is greater for females than for males
(dealing only with the total associations, that is to say).‡ This is true for all divisions
of our material, and for the Census defects as well as for those dealt with by Dr.
Warner. The evidence is collected in Table X., which is based entirely on the
preceding tables. In 87 cases out of 101, or 86 per cent., the associations are
greater for females than for males. There seems some indication of a decrease in
the difference with advancing age ; thus in Standards IV.-Ex. VII. the females are
only greatest in 3 cases of 6. In the age groups over 25, pooling Tables VIII. and IX.
together, the females only exceed in 71 per cent. of the cases, or 15 out of 21, instead
of 86 per cent., or 18 out of 21.

§ 68. Besides being more highly associated, women are also in general less defective 
than men. They exhibit a smaller percentage of individuals with development 
defects, nerve signs, or mental dulness, but a slightly higher percentage with low 

* It will be noted that the age groups are not the same as in the last case, the figures being grouped
more coarsely in the 1881 Census.

† Professor Pearson informs me that unpublished material in his hands goes to show that correlation
decreases with age; theoretically also he would expect selection to decrease correlation.

‡ This is again in accord with the evidence for correlation, females being more highly correlated than
males.

Diagrams Illustrating the Change in Association of Defects during Life (Table VIII.).

Table IX. Showing the Association between Imbecility or Idiocy and Deaf-mutism
for both Sexes and successive Age Groups; and the proportion per 100,000 of
Imbeciles and Deaf-mutes.

                                    All ages.     5-            15-               25-               45-           65-

Association between       Male      .924 ± .003   .943 ± .005   .931 ± .006 (?)   .907 ± .008 (?)   .889 ± .014   .482 ± .184
imbecility and dumbness   Female    .913 ± .004   .953 ± .005   .878 ± .015       .890 ± .010       .910 ± .010   .770 ± .053

Proportion per 100,000 in same age group.

Deaf and dumb             Male      56            63            65                63                61            61
                          Female    46            55            51                52                47            52

Imbeciles or idiots       Male      128           97            174               156               149           217
                          Female    125           67            136               161               186           273


nutrition.* In the case of Census defects (Tables VIII. and IX.), the proportion per
100,000 living at all ages is less for females in blindness, deaf-mutism, and
imbecility, but greater for females in the case of general mental derangement. 
Comparing the separate age groups the proportion per 100,000 living at each age is 
least for females in every case at age groups under twenty-five. In the older age 
groups, on the other hand, in the cases of mental derangement and imbecility, the 
females show the greater proportion of defectives. 

Thus at first sight it would appear that the female, though at first a greatly 
superior animal to the male, was liable to break down mentally at a much more rapid 
rate and at an earlier age. This, however, would probably be a fallacious conclusion ; 
it is pointed out in the Census Report for 1881 that the death-rate for male 
lunatics is known to be very much higher than for female lunatics, almost half as
high again in fact.† Hence the greater proportion per 1000 living at the later age
groups may, either entirely or to a large extent, be simply due to accumulation. 
There is apparently no evidence that the case-rate is greater for females at the 
higher age groups. Thus, so far as we can say at present, the female may remain 
the less defective animal throughout life. 
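
The accumulation argument may be illustrated numerically. The sketch below uses the death and recovery rates quoted from the General Report for 1881; everything else is an assumption made here for illustration only. In particular, the stationary model (a register that cases enter at a constant rate and leave only by death or recovery) and the name `steady_state_prevalence` are hypothetical and are not taken from the Census Report or from the present memoir.

```python
# Annual rates, per cent. of the average number on the register, 1872-81,
# as quoted from the General Report for 1881.
death_rate = {"male": 11.94, "female": 8.13}
recovery_rate = {"male": 10.50, "female": 11.59}

# "Almost half as high again": ratio of the male to the female death-rate.
death_ratio = death_rate["male"] / death_rate["female"]

def steady_state_prevalence(case_rate, sex):
    """ASSUMPTION: a stationary register in which cases arrive at `case_rate`
    and leave only by death or recovery, so that
    prevalence = inflow / (fractional rate of removal)."""
    removal = (death_rate[sex] + recovery_rate[sex]) / 100.0
    return case_rate / removal

case_rate = 1.0  # hypothetical, and taken to be identical for both sexes
prevalence_ratio = (steady_state_prevalence(case_rate, "female")
                    / steady_state_prevalence(case_rate, "male"))

print(round(death_ratio, 2), round(prevalence_ratio, 2))
```

Under these assumptions an identical case-rate for the two sexes would still leave a larger proportion of females on the register, which is the sense in which the excess at the later ages may be due to accumulation alone.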

§ 69. Summarising the general conclusions, I have shown good ground for accepting 
as general laws that females are more highly associated than males, and that 

* Cf. Table VI., p. 54, for a comparison in each separate Standard.

† General Report for 1881, pp. 66-67: "According to the returns of the Lunacy Commissioners from
1872 to 1881 inclusively, the mean annual death-rate among the registered male insane was 11.94 per
cent. of the average daily number on the register, while the death-rate of the females was 8.13 per cent.
The recovery rate of the males was 10.50 per cent., and that of the females 11.59 per cent."

Table X. Showing the number of Cases in which the Association between Defects is
greater for Males and greater for Females (based on preceding tables).

Number of cases in        Table I.            Table II.                                            Table III.
which the association     1888-    1892-      Certified    Poor Law   Homes and     Public         English.   Jews.   Irish.
is greater.*              1891.    1894.      industrial   schools.   orphanages.   elementary.
                                              schools.

For males                   -        -           1            1           1             -             1         -       -
For females                 6        6           5            5           5             6             5         6       6

Number of cases in        Tables IV. and V.                      Table VIII.              Table IX.        Total.
which the association     Infants.   Standards   Standards       5-25    25-45    45-     5-25    25-
is greater.*                         I.-III.     IV.-VII.

For males                   -           -           3              -       2       3        1       1        14
For females                 6           6           3              6       4       9        1       2        87

* Total associations only.


associations are greater for the young than the old. I have not, however, been able 
to say with certainty whether the decrease in association was due to growth (change 
in the individual) aided by selection, or growth opposed by selection. The only 
available evidence suggested that selection acted in the same direction as growth. 

The total associations between defects were always high, except in some cases for 
the very old, but the partial associations for defective universes were in most cases 
low, and in some cases even certainly negative. The partial associations for un- 
defective universes were, on the other hand, higher than the total coefficients, a fact 
which I held to imply that no one of the defects recorded was necessary as a connecting 
link between any other pair. 

§ 70. During the course of the present investigation I have naturally been led
to study pretty thoroughly the "Report" of the Committee on Childhood, and the
papers by Dr. Warner, bearing on the same subject, in the 'Journal of the Royal
Statistical Society,' papers which first drew my attention to the need of a
theoretical study of the whole subject of association. I may be pardoned, then, for
offering some suggestions as to the work of the Committee as regards both the mode
of publication or arrangement of its results, and the directions in which future work
might possibly be undertaken.

To deal with the questions of arrangement, &c., first. I suggest that the notation 
might with great advantage be altered to the notation of Jevons, such as I have used. 
The shortcomings of a notation which has to represent the simple second-order 
frequency (Aβ) by (A+) − (AB+) are obvious. Next, I have noticed that the
arrangement of frequencies is very irregular in the report. Where the ultimate
fourth-order frequencies (Dr. Warner's "Primary" groups) alone are given it is
quite clear, but where more are given the data are seldom complete, and groups of the
same order are not kept together. As an example of the way in which I think
frequencies ought to be given I append a table giving the general results of the
1892-94 investigation. If so much space could not be spared, I think the statement
of the fourth-order frequencies is quite sufficient, as the others can be so readily
calculated from them.*
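
How readily the lower-order frequencies follow from the fourth-order ones may be seen from the following sketch in Python. It takes the sixteen ultimate frequencies for boys from the appended table and sums those classes that agree with the group required. The notation is an assumption adopted for plain text only: capitals denote a defect present, and the small Latin letters a, b, c, d stand in for the Greek α, β, γ, δ. The function name `frequency` is likewise hypothetical.

```python
# Fourth-order (ultimate) frequencies for boys, 1892-94 investigation.
# ASSUMED notation: capital = defect present; a, b, c, d replace the
# Greek letters alpha, beta, gamma, delta (defect absent).
BOYS = {
    "abcd": 21511, "Abcd": 802, "aBcd": 1059, "abCd": 108,
    "abcD": 331,   "ABcd": 415, "AbCd": 134,  "AbcD": 394,
    "aBCd": 115,   "aBcD": 703, "abCD": 63,   "ABCd": 69,
    "ABcD": 323,   "AbCD": 91,  "aBCD": 89,   "ABCD": 80,
}

def frequency(group, ultimate=BOYS):
    """Sum the ultimate classes that contain every letter named in `group`.

    An empty `group` places no restriction and so returns the total number
    of observations.
    """
    return sum(count for label, count in ultimate.items()
               if all(letter in label for letter in group))

total = frequency("")       # all boys observed
first_order = frequency("A")
second_order = frequency("AB")
third_order = frequency("ABc")

print(total, first_order, second_order, third_order)
```

Each ultimate label holds exactly one of each pair of letters, so a simple membership test suffices to decide whether a class belongs to the group asked for.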

As regards future work, I find myself unable to follow at all the remarks made by 
the Committee on p. 5 of the "Report" (the italics are mine):

" A very valuable addendum to vital statistics might be obtained by following 
up the history of certain cases recorded, by subsequent periodical inspections, but 
as this is beyond the power of the present Committee, it can only be suggested as
one among many other directions in which enquiry may be pushed in the hands of 
official Commissions." 

The Committee, or Childhood Society, is, as I gather from its Reports, continuing
the work of inspecting children in schools, and why it should be "beyond their
power" to reinspect a few large schools year by year is by no means obvious. They
are at present (in the last Reports,† for 1898-99) issuing statistics of the frequencies
of different groups of defects for different ages. As I have already pointed out
(note on p. 304) these statistics are rendered almost worthless by the omission of the
frequency (αβγδ), the number of undefective children, at each age, only those who
were defective having apparently been noted. Even if the material were, however,
complete, it would not enable us to distinguish between changes due to selection and
changes due to growth, nor consequently to state what are the effects of these agencies
each by itself. Nothing but observance of one group of individuals year by year
can do this, and there seems little more difficulty in this than in the work already
being carried out. It is surely futile to expect a Royal Commission on the subject.
It would, in fact, be more appropriate for a body, specially created for the "Scientific
Study" of childhood, to take up an investigation of the greatest scientific interest
but probably of little immediate practical use. I would most strongly urge the

* There appear to be a good many misprints in the Report, many of the frequencies being in
disagreement with those given for the ultimate ("Primary") groups. I have generally assumed the "primary"
frequencies to be correct.

† British Association Reports, 1898, p. 691; 1899, p. 489.

Table showing the Frequencies of all Groups of Defects. All Ages.
(1892-94 investigation.)

           Frequency.                    Frequency.                    Frequency.
Group.    Boys.   Girls.      Group.    Boys.   Girls.      Group.    Boys.   Girls.

αβγδ     21,511   20,207      αβγ      21,842   20,504      αβ       22,013   20,667
Aβγδ        802      445      αβδ      21,619   20,317      αγ       23,604   21,753
αBγδ      1,059      762      αγδ      22,570   20,969      αδ       22,793   21,188
αβCδ        108      110      βγδ      22,313   20,652      βγ       23,038   21,263
αβγD        331      297      Aβγ       1,196      759      βδ       22,555   20,924
ABγδ        415      207      Aβδ         936      607      γδ       23,787   21,621
AβCδ        134      162      Aγδ       1,217      652      Aβ        1,421    1,031
AβγD        394      314      Bγδ       1,474      969      Aγ        1,934    1,190
αBCδ        115      109      αBγ       1,762    1,249      Aδ        1,420      891
αBγD        703      487      αBδ       1,174      871      αB        1,966    1,428
αβCD         63       53      αβC         171      163      Bγ        2,500    1,680
ABCδ         69       77      αCδ         223      219      Bδ        1,658    1,155
ABγD        323      224      βCδ         242      272      αC          375      342
AβCD         91      110      αβD         394      350      βC          396      435
αBCD         89       70      βγD         725      611      Cδ          426      458
ABCD         80       79      αγD       1,034      784      αD        1,186      907
                              ABγ         738      431      βD          879      774
Total    26,287   23,713      ABδ         484      284      γD        1,751    1,322
                              AβC         225      272      AB          887      587
                              ACδ         203      239      AC          374      428
                              AβD         485      424      AD          888      727
                              AγD         717      538      BC          353      335
                              αBC         204      179      BD        1,195      860
                              BCδ         184      186      CD          323      312
                              BγD       1,026      711
                              αBD         792      557      A         2,308    1,618
                              αCD         152      123      α        23,979   22,095
                              βCD         154      163      B         2,853    2,015
                              ABC         149      156      β        23,434   21,698
                              ABD         403      303      C           749      770
                              ACD         171      189      γ        25,538   22,943
                              BCD         169      149      D         2,074    1,634
                                                            δ        24,213   22,079
                                                            Total    26,287   23,713



importance of such reinspections of the same children on the Childhood Society 
and the Committee of the British Association that corresponds with it. 

As a further subject, I suggest the question as to the hereditary character of 
these defects — development defects, nerve signs, low nutrition, and mental 
dulness — not necessarily by studying the parents, which would probably be a 
difficult matter, but by noting pairs of brothers. Suppose a large number of groups 
of brothers,* or "fraternities," to be noted for some defect, say A; reckon for each

* Or, of course, sisters; or brothers and sisters, "geschwister," "siblings" as Professor Pearson has
proposed to call them, since we have no modern English word for members of the same family without
regard to sex.

fraternity the number of AA, Aα, αA, and αα pairs, as in the example on "Temper
in Fraternities," p. 291, and tabulate the total number of such separate pairs as in
that example. These numbers give the association between brothers at once. 
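
The tabulation proposed may be sketched in Python as follows. Two assumptions are made here and should be read as such. First, each pair of brothers is entered twice, once in each order, which the separate listing of Aα and αA pairs suggests. Second, the association is measured by the coefficient Q = (ad − bc)/(ad + bc) formed from the fourfold table of pairs. The data and the function names `pair_counts` and `association` are hypothetical.

```python
def pair_counts(fraternities):
    """Tally ordered pairs of brothers over all fraternities.

    Each fraternity is given as (n, k): n brothers noted, of whom k show
    the defect A. ASSUMPTION: every pair is counted in both orders.
    Returns the totals (AA, A-alpha, alpha-A, alpha-alpha).
    """
    both = first_only = second_only = neither = 0
    for n, k in fraternities:
        both += k * (k - 1)
        first_only += k * (n - k)
        second_only += (n - k) * k
        neither += (n - k) * (n - k - 1)
    return both, first_only, second_only, neither

def association(both, first_only, second_only, neither):
    """Coefficient of association Q for a fourfold table of pairs."""
    like = both * neither
    unlike = first_only * second_only
    return (like - unlike) / (like + unlike)

# Hypothetical fraternities: (number of brothers, number with defect A).
sample = [(3, 2), (2, 0), (4, 1), (2, 2), (3, 0)]
counts = pair_counts(sample)
q = association(*counts)

print(counts, round(q, 3))
```

Because each pair is entered in both orders the table of pairs is symmetrical, so the two middle counts are always equal.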

A certain portion of such fraternal association might be due to similarity of 
environment for the brothers, if the defects observed were much affected by this. 
I do not see how the home-environment could be allowed for, but it could be tested 
whether the environment of school had any such effect. Take from a considerable 
number — say 100 — different schools a series of samples, say 50 or 100 children 
from each. Each of these groups forms what we may call a "community" or
group subjected to common conditions, as opposed to the fraternity. Find the 
association between members of the Community in just the same way as the 
association between members of the Fraternity. This would give a measure of the 
effect of environment as opposed to inheritance. Possibly this might be done with 
the material now in the hands of the Society. 



